1 Introduction

The microbiome, defined as the complex community of diverse microorganisms that inhabit the human body, along with their area of influence (including interactions, metabolites, and genetic material), has recently gained significant attention due to its profound impact on health and disease. As highlighted by Hou et al. (2022), the microbiota composition is diverse and differs between body sites (e.g., the gastrointestinal system, skin, oral cavity, respiratory system, or reproductive system). Among these, the gut microbiome plays a crucial role in maintaining homeostasis. However, an imbalance in the microbiota (dysbiosis, or loss of homeostasis) can contribute to the development of gastrointestinal diseases such as colorectal cancer (CRC).

Consequently, there is significant concern about the potential effects of antibiotics and other medications on cancer therapy, given their bidirectional impact on the microbiota composition. The goal of this research is to determine whether specific bacteria or bacterial clusters are present or absent in two groups of colorectal cancer patients classified by chemotherapy treatment toxicity.

1.1 Cohort characteristics

Between October 2017 and April 2021, we prospectively enrolled 36 adults with histologically confirmed colorectal cancer at the University Hospital Complex of A Coruña (Spain). One faecal sample was collected from each patient before the first chemotherapy cycle and profiled by targeting the 16S rRNA V3–V4 region; antibiotics and probiotics were prohibited for the preceding four weeks. All patients started a doublet of oxaliplatin (OX) plus 5-fluorouracil (5-FU, FOLFOX-like) and/or radiotherapy.

1.2 Toxicity Variable Design

Our oncologist designed the toxicity variable according to the following criteria. Clinical metadata were extracted from the electronic record and merged with the microbiome profiles. The primary outcome is a dichotomous toxicity target variable: low (\(n=11\)) versus severe (\(n=25\)), defined by CTCAE v5.0 together with a \(\ge 20\,\%\) chemotherapy dose reduction or suspension of 5-FU and/or OX. Absence of systemic chemotherapy was the sole exclusion criterion, yielding the present cohort of 36 patients. Detailed coding rules and extended baseline characteristics are provided in Supplementary Table S1 of the paper.

2 Research Questions

RQ1: Are the outputs of the differential abundance analysis (DAA) methods similar when different microbiome data transformations are applied?

RQ2: Are the DAA methods’ outputs similar with and without prior prevalence filtering?

RQ3: Is it possible to identify features that are differentially abundant between the two toxicity groups? (The study groups were defined according to the oncologists’ criteria.)

3 Libraries

4 Data

Importing the phyloseq object:

## phyloseq-class experiment-level object
## otu_table()   OTU Table:         [ 6082 taxa and 120 samples ]
## sample_data() Sample Data:       [ 120 samples by 45 sample variables ]
## tax_table()   Taxonomy Table:    [ 6082 taxa by 7 taxonomic ranks ]
## phy_tree()    Phylogenetic Tree: [ 6082 tips and 6004 internal nodes ]

All decontamination steps were performed in earlier stages of the paper (NOTE (1)), so this phyloseq object contains no contaminants. Because the derep_ps object comes from another paper (NOTE (1)), the sample data contains some extra columns.

Loading and Cleaning Data

Since some patients had replicated samples, we aggregated their measurements using the median.
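The replicate aggregation described above can be sketched in base R as follows. This is a minimal illustration on a toy matrix, not the code used in the report: the object names (counts, patient_id, aggregate_by_median) and all values are hypothetical.

```r
# Toy count matrix: rows = taxa, columns = samples (hypothetical values).
counts <- matrix(
  c(10, 14, 12, 0,
     3,  5,  4, 8,
     0,  0,  2, 6),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("taxon_A", "taxon_B", "taxon_C"),
                  c("P01_r1", "P01_r2", "P01_r3", "P02_r1"))
)

# Hypothetical mapping from each sample (column) to its patient.
patient_id <- c("P01", "P01", "P01", "P02")

# For every taxon, take the median across the replicates of each patient.
aggregate_by_median <- function(mat, groups) {
  sapply(unique(groups), function(g) {
    apply(mat[, groups == g, drop = FALSE], 1, median)
  })
}

patient_counts <- aggregate_by_median(counts, patient_id)
print(patient_counts)
```

Note that the median of an even number of replicates can be non-integer, which matters for methods that expect integer counts.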

Transform relevant variables (toxicity variable and gender) into factors

5 Pre-Analysis

5.1 Prevalence filter

The first step is to evaluate the number of retained taxa under different prevalence values (a taxon should be present in more than one person). To do so, we must keep in mind that we have a total of 36 patients. For example, a prevalence of 10%, i.e. 10 out of 100 individuals, corresponds to 3–4 individuals when adjusted to our N (3.6 individuals exactly). We tried different prevalence values:

  • With 0.01–1: 3022 taxa are retained, but the threshold corresponds to less than one person.
  • With 3.6 and 5: 1045 taxa are retained; these thresholds correspond to 3–4 individuals (prev=3.6) and 1.8 individuals (prev=5).

Prevalence: 5% (1045 taxa)

After testing multiple prevalence thresholds, a 5% cutoff was chosen to maintain a meaningful number of taxa (1,045) while ensuring their presence in roughly two out of 36 patients.
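The arithmetic behind these thresholds can be checked with a short base R sketch. It converts prevalence percentages into the number of patients they represent in a cohort of 36 and shows one possible prevalence filter on a toy matrix. The helper filter_by_prevalence and the "count > 0" detection rule are assumptions for illustration, not the report's implementation.

```r
n_patients <- 36                 # cohort size stated in the text
prevalence_pct <- c(1, 5, 10)    # candidate thresholds, in percent

# Number of patients each percentage corresponds to: 0.36, 1.8 and 3.6.
patients_needed <- prevalence_pct / 100 * n_patients
names(patients_needed) <- paste0(prevalence_pct, "%")
print(patients_needed)

# Hypothetical helper: keep taxa detected (count > 0) in at least
# `min_prev` (a proportion between 0 and 1) of the samples.
filter_by_prevalence <- function(mat, min_prev) {
  prevalence <- rowMeans(mat > 0)
  mat[prevalence >= min_prev, , drop = FALSE]
}

# Toy matrix: rows = taxa, columns = samples.
toy <- matrix(c(5, 0, 0, 0,
                2, 3, 0, 1,
                0, 0, 0, 0),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("rare", "common", "absent"),
                              paste0("S", 1:4)))

kept <- filter_by_prevalence(toy, min_prev = 0.5)
print(rownames(kept))
```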

5.2 Some metadata stats

Data frame summary of a selection of relevant variables

The goal of this step is to gain a preliminary understanding of our data and of the relevant variables that could be used in further steps.

Genus Level Glom Taxa (unfiltered ps)

Taxa were agglomerated to genus level using our phyloseq object (tox_ps_final).
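Conceptually, agglomerating to genus level means summing the counts of all ASVs that share a genus label. The base R sketch below illustrates that idea on toy data. In a phyloseq workflow this step is usually done with tax_glom(); the matrix, the ASV names and the genus assignments here are hypothetical.

```r
# Toy ASV count matrix: rows = ASVs, columns = samples (hypothetical values).
asv_counts <- matrix(
  c(4, 0, 7,
    1, 2, 0,
    0, 5, 3),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("asv1", "asv2", "asv3"), c("S1", "S2", "S3"))
)

# Hypothetical genus assignment for each ASV.
genus <- c(asv1 = "g__Blautia", asv2 = "g__Blautia", asv3 = "g__Lachnospira")

# Sum the rows that share the same genus label.
genus_counts <- rowsum(asv_counts, group = genus[rownames(asv_counts)])
print(genus_counts)
```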

Plot: Toxicity associated to gender distribution

NOTE: This code was adapted from Conde-Perez et al. (2024) paper, referenced in NOTES section.

A quick look at our dataset shows the distribution by sex and age range in our two toxicity classes. The following plot shows the density of subjects by sex and age for each level of the toxicity variable (TOX_SEVERA):

This plot compares the age distribution by toxicity group and by sex. In the Low_Tox panel, the sample is markedly male-skewed (2 females vs 9 males). Ages in this group are mostly older: one of the two females falls in the 50–60 range, whereas nearly all remaining observations lie between 60 and 90 years. In the Severe_Tox panel, the sex ratio is more balanced (11 females vs 14 males), and both sexes skew older, with most individuals aged ≥70 years. These counts and age ranges refer to the individuals shown in the plot.

6 Sample Justification and Power Analysis

After reviewing various methodological options, we decided to perform a G*Power analysis. The G*Power results supported the relevance of our findings within the cohort, suggesting that the significant microbiome differences observed in one of the two toxicity groups are likely due to strong abundance shifts that remain detectable even with a relatively small sample size (36 patients).

Additional tools and R packages, such as micropower, were also considered. However, micropower could not be applied in our case due to its reliance on simulated data rather than empirical input.

NOTE: micropower is a specialized R package designed to estimate statistical power and sample size in microbiome studies. It helps researchers determine whether their cohort size is sufficient to detect differences in the relative abundances of bacterial taxa.

7 Results: Differential Abundance Analysis – Ensemble Benchmarking

As the next step, we performed a benchmarking of several differential abundance analysis (DAA) methods to identify microbial taxa significantly associated with one of the two study groups. Patients were categorized based on the severity of chemotherapy-related toxicity, as assessed by oncologists: those who experienced mild or manageable toxicity were assigned to the low toxicity group, while those who experienced severe toxicity and/or required dose reduction or discontinuation of one of the two treatment regimens were assigned to the severe toxicity group. More information is available in the paper.

Given the variability in statistical assumptions and performance across DAA methods, we implemented an ensemble consensus approach to increase robustness and reduce method-specific biases. This approach involved comparing the outputs of multiple DAA methods and identifying consistently detected taxa (i.e. those that appear as significantly differentially abundant across all methods). These overlapping results were considered the most reliable candidates for biologically meaningful group-specific microbial signatures.

The main objective of this ensemble strategy was to highlight consensus taxa and assess the reproducibility of findings across different analytical frameworks, thereby enhancing the overall validity of our results.
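The consensus idea can be sketched as a set intersection over the per-method lists of significant taxa. The example below uses placeholder taxon and method names only; it is not derived from the results of this report.

```r
# Hypothetical lists of taxa flagged as significant by three methods.
method_hits <- list(
  method_A = c("g__Taxon1", "g__Taxon2", "g__Taxon3"),
  method_B = c("g__Taxon2", "g__Taxon3"),
  method_C = c("g__Taxon3", "g__Taxon2", "g__Taxon4")
)

# Strict consensus: taxa reported by every method.
consensus <- Reduce(intersect, method_hits)
print(consensus)

# Softer view: how many methods flagged each taxon.
votes <- sort(table(unlist(method_hits)), decreasing = TRUE)
print(votes)
```

The vote table is useful when the strict intersection is empty, since it still ranks taxa by how many methods agree on them.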

Methods and Strategy

In this section, we apply a diverse set of differential abundance analysis (DAA) methods, each based on different statistical assumptions, data transformations, and normalization strategies. Despite their methodological differences, all share a common goal: to identify taxa that show significant differences in abundance based on a grouping variable.

We evaluate each method under multiple conditions:

  • With and without prevalence filtering.
  • With and without false discovery rate (FDR) control, using the Benjamini–Hochberg (BH) correction.
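To make the effect of the BH correction concrete, the sketch below adjusts a small vector of hypothetical raw p-values with base R's p.adjust() and reproduces the same result by hand. The p-values are invented for illustration.

```r
# Hypothetical raw p-values for five taxa.
p_raw <- c(0.001, 0.008, 0.039, 0.041, 0.20)

# Benjamini-Hochberg adjustment with base R.
p_bh <- p.adjust(p_raw, method = "BH")

# The same adjustment written out step by step.
m <- length(p_raw)
ord <- order(p_raw)
manual <- p_raw[ord] * m / seq_len(m)   # p_(i) * m / i for sorted p-values
manual <- rev(cummin(rev(manual)))      # enforce monotonicity from the largest p down
manual <- pmin(manual, 1)[order(ord)]   # cap at 1 and restore the original order

print(rbind(raw = p_raw, BH = p_bh))
print(c(raw_below_0.05 = sum(p_raw < 0.05), BH_below_0.05 = sum(p_bh < 0.05)))
```

In this toy vector, four features pass the raw 0.05 threshold but only two remain after correction, which is the kind of drop the with/without FDR comparison is meant to expose.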

A distinctive aspect of our study is the clinical definition of the comparison groups. Unlike conventional case-control designs, our study focuses on two groups of CRC patients defined by chemotherapy-induced toxicity:

  • Low toxicity group: patients who experienced mild or manageable toxicity
  • Severe toxicity group: patients who experienced severe toxicity and/or required dose reduction or discontinuation of treatment

This grouping was based on clinical evaluations and treatment modifications, and reflects a real-world, treatment-driven classification. The grouping variable used for DAA is therefore both clinically meaningful and directly relevant to the outcome of interest.

7.1 (1) Without previous prevalence filtering evaluating Genus and ASV levels

In this unfiltered approach, we use the phyloseq object that was not filtered by prevalence, named tox_ps_final_genus. For biomarker discovery in our clinical prediction setting, we aggregated taxa to genus level.

7.1.1 ANCOM-BC (NO BH correction and prevalence=0)

We first applied the ANCOM-BC method (ancombc function) to identify differentially abundant taxa between the two toxicity groups. For this initial analysis, we used the following settings:

  • No prevalence filtering (prv_cut = 0)
  • No multiple testing correction (p_adj_method = "none")
  • q-value cutoff: 0.05 (qval_cut = 0.05)
  • Significance threshold: alpha = 0.05
  • Covariates included in the model: target_variables (used only in the covariates section)

The analysis was performed at the ancom_target taxonomic level using parallel processing with 8 cores (n_cl = 8).
We used the argument conserve = TRUE to apply conservative variance estimates.

NOTE: Because the data were already aggregated to genus level and no further aggregation is wanted, the ANCOM-BC target_level parameter should be set to “ASV”. See code for further details.

NOTE (2): The ancombc global function code was adapted from the paper: 16S, Conde-Pérez et al (2024). Reference in NOTE (1).

ANCOM-BC No Prevalence-NoBH Plot

The results provide an initial, unadjusted list of candidate taxa associated with toxicity status, to be compared with other DAA methods in subsequent steps.

7.1.2 ANCOM-BC (BH correction and prevalence=0)

We then applied the ANCOM-BC method (ancombc function) to identify differentially abundant taxa between the two toxicity groups, this time incorporating false discovery rate (FDR) control via Benjamini–Hochberg (BH) correction. The parameters used were:

  • No prevalence filtering (prv_cut = 0)
  • FDR correction applied (p_adj_method = "BH")
  • Significance threshold: alpha = 0.05
  • Covariates included in the model: target_variables

The analysis was conducted at the ancom_target taxonomic level using 8 processing cores (n_cl = 8) and conservative variance estimation (conserve = TRUE).

NOTE: Because the data were already aggregated to genus level and no further aggregation is wanted, the ANCOM-BC target_level parameter should be set to “ASV”. See code for further details. We therefore also ran the main function on each taxonomic level with a prevalence cut of 0:

ANCOM-BC No Prevalence-BH Plot

These results reflect a more stringent significance threshold compared to the unadjusted analysis, allowing us to identify taxa that remain robustly associated with toxicity group after correcting for multiple comparisons.

7.1.3 ALDEx2 (Official Package)

ALDEx2 (ANOVA-Like Differential Expression) is a method designed to identify differences in feature abundance between groups in compositional data, such as those generated by microbiome sequencing (e.g., 16S rRNA or metagenomics). It works by generating multiple Monte Carlo instances from a Dirichlet distribution to account for technical variability, then applying a centered log-ratio (CLR) transformation to the data.

Statistical tests (e.g., Welch’s t-test or Wilcoxon) are performed on these transformed values to detect differential abundance. In our case, as we only have two groups to evaluate, a t-test is performed. This approach returns the expected p-value of Welch’s t-test (we.ep) and its Benjamini–Hochberg-corrected q-value (we.eBH). Both values are obtained from the Monte Carlo estimation on the CLR-transformed abundance values. Note*

ALDEx2 also estimates effect size (log-fold change) and includes adjusted p-values (e.g., via Benjamini-Hochberg correction) for multiple testing. Its main strength lies in properly addressing the compositional nature of relative abundance data, reducing the risk of false positives compared to traditional methods.

Note*: If more than two groups are evaluated, the Kruskal–Wallis (kw) test should be used instead.
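The CLR transformation and the per-taxon Welch test can be illustrated with base R. This is a simplified single-pass sketch on simulated counts: ALDEx2 additionally repeats the procedure over Monte Carlo Dirichlet instances and reports expected values, which is not reproduced here. The 0.5 pseudocount, the group labels and the simulated data are assumptions.

```r
# Centered log-ratio: log of each value minus the mean log of its sample.
clr <- function(x) {
  log_x <- log(x)
  log_x - mean(log_x)
}

set.seed(1)
# Simulated counts: 6 taxa x 8 samples (hypothetical data).
counts <- matrix(rpois(48, lambda = 20), nrow = 6,
                 dimnames = list(paste0("taxon", 1:6), paste0("S", 1:8)))
group <- factor(rep(c("Low", "Severe"), each = 4))

# Apply CLR per sample (column); a small pseudocount avoids log(0).
clr_values <- apply(counts + 0.5, 2, clr)

# Welch's t-test per taxon (t.test uses var.equal = FALSE by default).
welch_p <- apply(clr_values, 1, function(v) t.test(v ~ group)$p.value)
welch_q <- p.adjust(welch_p, method = "BH")

print(round(cbind(p = welch_p, q = welch_q), 3))
```

Each column of clr_values sums to zero, which is the defining property of the CLR transformation.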

## [1] TRUE
## |------------(25%)----------(50%)----------(75%)----------|
## [1] "Filter alpha 0.05"
##            corrected
## uncorrected FALSE
##       FALSE   292
##       TRUE      5
## [1] "Filter alpha 0.1"
##            corrected
## uncorrected FALSE
##       FALSE   286
##       TRUE     11

Plot: ALDEx2 No Prevalence-NoBH Plot

## ⚠️ No features with p < 0.05 y q < 0.05. So, no plot would be generated.

Filtering with a p-value threshold of 0.1 to see which features are retained at p-values of 0.05 and 0.01.

In the case of ALDEx2, no features returned q-values lower than 0.1 or 0.05 (the filtering criteria).

7.1.4 ASV ALDEx2

## [1] TRUE
## |------------(25%)----------(50%)----------(75%)----------|
## [1] "Filter alpha 0.05"
##            corrected
## uncorrected FALSE
##       FALSE  3016
##       TRUE      6
## [1] "Filter alpha 0.1"
##            corrected
## uncorrected FALSE
##       FALSE  3010
##       TRUE     12

Plot(ASV): ALDEx2 No Prevalence-NoBH

## ⚠️ No features with p < 0.05 y q < 0.05. So, no plot would be generated.

Filtering with a p-value threshold of 0.1 to see which features are retained at p-values of 0.05 and 0.01.

7.1.5 DESeq2 no BH and BH

In accordance with the DESeq2 authors’ recommendations, we applied a pre-filtering step to remove features with very low counts, particularly given our relatively small sample size (11 patients in the smallest group). This filtering helps reduce memory usage, improves computational efficiency, and enhances the interpretability of downstream analyses such as PCA and dispersion plots by removing uninformative features.
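A low-count pre-filter of this kind can be sketched in base R as follows. The rule shown (keep features with at least a minimum count in at least as many samples as the smallest group) mirrors the style of pre-filter suggested in the DESeq2 vignette, but the thresholds and the toy matrix are assumptions and may differ from what was applied in the report.

```r
# Toy count matrix: rows = taxa, columns = samples (hypothetical values).
counts <- matrix(
  c( 0,  1, 0,  2,
    15, 12, 0, 30,
    11,  0, 0,  0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("taxon_A", "taxon_B", "taxon_C"), paste0("S", 1:4))
)

min_count <- 10       # assumed minimum count per sample
smallest_group <- 2   # assumed size of the smallest group in this toy example

# Keep taxa that reach min_count in at least smallest_group samples.
keep <- rowSums(counts >= min_count) >= smallest_group
filtered <- counts[keep, , drop = FALSE]

print(keep)
print(rownames(filtered))
```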

We employed the Wald test for differential abundance testing, as our experimental design involves a binary comparison (two conditions in the TOX_SEVERA variable). The Likelihood Ratio Test (LRT) is more appropriate for more complex models or multi-factor designs, and thus was not used here.

Finally, we compared raw p-values with those adjusted using the Benjamini–Hochberg (BH) method for False Discovery Rate (FDR) control. This comparison allows us to assess the number of features filtered out by the multiple testing correction process, and to better understand the trade-off between sensitivity and specificity in identifying significant taxa.

Features with raw p-value < 0.05 and < 0.1, to see the difference:

7.1.6 ASV

Features with raw p-value < 0.05 and < 0.1, to see the difference:

7.1.7 LEfSe

LEfSe (Linear Discriminant Analysis Effect Size) is a statistical method designed to identify features (e.g., taxa) that are both statistically significant and biologically relevant across predefined groups. It combines non-parametric Kruskal–Wallis and Wilcoxon rank-sum tests with Linear Discriminant Analysis (LDA) to estimate the effect size of each differentially abundant feature.

In this analysis, we applied LEfSe using both CPM-transformed counts and relative abundances, as recommended in the literature. Since LEfSe does not implement any multiple testing correction by default, Benjamini–Hochberg (BH) correction was applied manually to the resulting p-values to allow fair comparison with other differential abundance analysis (DAA) methods.

7.1.7.1 (CPM) LEfSe groups previous sum of 1e-06

We applied the LEfSe method to identify differentially abundant taxa between the two toxicity groups, using CPM (counts per million) normalized data. The significance thresholds of the Kruskal–Wallis and Wilcoxon tests were set to 0.05.

The class variable used to define the groups was toxicity level (low vs. severe). LEfSe was then applied to detect taxa that are not only statistically significant, but also biologically consistent across the groups, using a non-parametric approach followed by linear discriminant analysis (LDA).

NOTE: To ensure numerical stability during the LEfSe analysis, a small pseudocount (1e-06) was added to all abundance values prior to normalization. This step is essential because LEfSe applies logarithmic transformations during its processing pipeline (e.g., after CPM or relative abundance normalization), and zero values can lead to undefined or unstable results when log-transformed. The pseudocount is small enough to have negligible impact on relative differences but prevents errors and maintains consistency across features during differential abundance testing and LDA effect size estimation.
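The pseudocount and the two normalizations mentioned in this section can be sketched in base R. The toy counts are hypothetical, and the order of operations (pseudocount first, then normalization) follows the description in the note above.

```r
pseudocount <- 1e-06

# Toy count matrix: rows = taxa, columns = samples (hypothetical values).
counts <- matrix(c(90, 0, 10,
                    5, 0,  5),
                 nrow = 3,
                 dimnames = list(c("taxon_A", "taxon_B", "taxon_C"),
                                 c("S1", "S2")))

# Add the pseudocount to every value before normalizing.
shifted <- counts + pseudocount

# Relative abundance: each sample (column) sums to 1.
relab <- sweep(shifted, 2, colSums(shifted), "/")

# Counts per million: each sample (column) sums to 1e6.
cpm <- relab * 1e6

# Zeros no longer produce -Inf after a log transformation.
print(log10(cpm))
```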

7.1.7.1.1 ASV Level

7.1.7.2 (RELAB) LEfSe groups previous sum of 1e-06

We also applied the LEfSe method to identify differentially abundant taxa between the two toxicity groups, using relative abundance normalized data. The significance thresholds of the Kruskal–Wallis and Wilcoxon tests were set to 0.05.

The class variable used to define the groups was toxicity level (low vs. severe). LEfSe was then applied to detect taxa that are not only statistically significant, but also biologically consistent across the groups, using a non-parametric approach followed by linear discriminant analysis (LDA).

NOTE: As in the CPM analysis, a small pseudocount (1e-06) was added to all abundance values prior to normalization, so that zero values do not lead to undefined or unstable results when log-transformed.

7.1.7.2.1 ASV Level

7.1.8 LINDA

LinDA (Linear Models for Differential Abundance) is a statistical method designed to detect differentially abundant microbial taxa between experimental groups in compositional microbiome data. It fits a linear model for each taxon, adjusting for compositionality and potential covariates (e.g., confounders like age or sex).

Output Structure:

LinDA returns a list of data frames, each corresponding to a tested variable (TOX_SEVERA in our case). The reference group of the tested variable is the first level returned by the command levels(sample_data(tox_ps_final_genus)$TOX_SEVERA). Each row represents a taxon and includes:

  • log2FoldChange: The estimated effect size relative to the reference group (positive values indicate higher abundance in the compared group, negative values in the reference group).

  • padj: Adjusted p-value using methods like Benjamini-Hochberg.

  • reject: Logical indicator (TRUE/FALSE) denoting whether the taxon is significantly differentially abundant (based on the padj threshold).

Additional fields: baseMean, standard error (lfcSE), test statistic, and degrees of freedom.

Interpretation:

  • Significance: Taxa with reject == TRUE and low padj (typically < 0.05) are considered significantly differentially abundant.

Direction of Change:

  • log2FoldChange > 0: More abundant in the comparison group (severe in our case).

  • log2FoldChange < 0: More abundant in the reference group (low in our case).

Magnitude: Larger absolute values of log2FoldChange suggest stronger differential abundance.

Confounders: When present, LINDA provides separate outputs per covariate, allowing assessment of each variable’s effect independently.
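Because the interpretation of the sign depends on which level is the reference, it helps to check how R chooses it. The sketch below uses a plain lm() fit on invented values with R's default treatment contrasts, where the coefficient is the non-reference mean minus the reference mean. The level names and the abundances are hypothetical, and the sign convention of any specific package output should still be confirmed against its documentation.

```r
# Hypothetical toxicity factor; levels are sorted alphabetically by default.
tox <- factor(c("Low_Tox", "Low_Tox", "Low_Tox",
                "Severe_Tox", "Severe_Tox", "Severe_Tox"))
print(levels(tox))        # the first level is the reference

# Invented log-scale abundances for one taxon.
log_abund <- c(1, 2, 3, 5, 6, 7)

# With Low_Tox as reference, the coefficient is mean(Severe) - mean(Low).
fit <- lm(log_abund ~ tox)
effect_vs_low <- coef(fit)[["toxSevere_Tox"]]

# Changing the reference level flips the sign of the coefficient.
tox2 <- relevel(tox, ref = "Severe_Tox")
fit2 <- lm(log_abund ~ tox2)
effect_vs_severe <- coef(fit2)[["tox2Low_Tox"]]

print(c(vs_low = effect_vs_low, vs_severe = effect_vs_severe))
```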

LinDA is particularly suited for microbiome data due to its compositional adjustments and flexible handling of multiple covariates. More information is available in the official tutorial: https://github.com/zhouhj1994/LinDA

Because LinDA applies strict criteria, we show and retain the features with at least qval < 0.1.

7.1.8.1 ASV Level

Because LinDA applies strict criteria, we show and retain the features with at least qval < 0.1.

7.1.9 ZicoSeq

ZicoSeq is a linear model– and permutation-based framework for differential abundance analysis designed for zero-inflated, compositional sequencing data. It accepts raw count or proportion data, along with associated metadata (e.g., grouping and covariates), and is particularly robust to common microbiome data challenges such as sparsity, outliers, and variability in sampling depth.

The method begins by filtering low-abundance and low-prevalence features. It then applies winsorization to limit the influence of outliers, and—when using count data—may incorporate Bayesian smoothing via a beta-mixture model to account for sampling variability and zero inflation.

Normalization is conducted using a reference-based iterative procedure, which identifies a stable set of features by excluding highly variable taxa. Differential abundance is then assessed using a linear modeling approach over multiple data transformations (e.g., log, square-root), enabling flexibility in capturing different types of feature–group relationships.

Statistical significance is determined using a permutation-based omnibus test, which preserves the correlation structure of the data. ZicoSeq reports both raw and adjusted p-values (FDR, and optionally FWER), along with effect size measures such as R² and signed test statistics, supporting robust biomarker discovery in compositional datasets.
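The winsorization step mentioned above can be illustrated with a small base R sketch that caps values above an upper quantile. The helper name winsorize_top, the 10% cap and the toy vector are assumptions chosen for readability; they are not the settings used by ZicoSeq in this report.

```r
# Hypothetical helper: cap values above the (1 - pct) quantile.
winsorize_top <- function(x, pct) {
  cap <- quantile(x, probs = 1 - pct, names = FALSE)
  pmin(x, cap)
}

# Toy abundance vector for one taxon with a single extreme value.
x <- c(1, 2, 2, 3, 3, 4, 4, 5, 5, 500)

w <- winsorize_top(x, pct = 0.10)
print(w)
```

Only the extreme value is pulled down towards the bulk of the data, so a single outlier sample has less influence on the downstream linear model.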

## [1] "matrix" "array"
## For proportion and other data types,  posterior sampling will not be performed!
## The data has  36  samples and  144  features will be tested!
## On average,  1  outlier counts will be replaced for each feature!
## Permutation testing ...
## .........
## Completed!

7.1.9.1 ZicoSeq Plot

In the case of ZicoSeq, R² values were used as proxies for effect sizes (logFC) for the purpose of comparison across methods.

7.1.9.2 ASV

## [1] "matrix" "array"
## For proportion and other data types,  posterior sampling will not be performed!
## The data has  36  samples and  224  features will be tested!
## On average,  1  outlier counts will be replaced for each feature!
## Permutation testing ...
## .........
## Completed!


7.2 (2) Filtering by prevalence directly in phyloseq object

Here, we use the prevalence-filtered phyloseq object, named physeq_filtered. We also aggregated it to genus level for biomarker discovery purposes and named the result physeq_filtered_genus:

7.2.1 ANCOM-BC (Prevalence approach, NO BH correction)

NOTE: Because the data were already aggregated to genus level and no further aggregation is wanted, the ANCOM-BC target_level parameter should be set to “ASV”. See code for further details.

Filtering by qval

7.2.1.1 ANCOM-BC Prevalence-NoBH Plot

Final ANCOM-BC df results:

7.2.2 ANCOM-BC (Prevalence approach, BH correction)

The ANCOM-BC package is used for differential abundance analysis.

Different target_level values were used.

The default p-value correction (Holm) was replaced by Benjamini–Hochberg.

NOTE: Because the data were already aggregated to genus level and no further aggregation is wanted, the ANCOM-BC target_level parameter should be set to “ASV”. See code for further details.

7.2.2.1 ANCOM-BC Prevalence-BH Plot

7.2.3 ALDEx2 (Official Package)

ALDEx2 was applied as described in the analysis without prevalence filtering: Monte Carlo instances are generated from a Dirichlet distribution, a centered log-ratio (CLR) transformation is applied, and Welch’s t-test is performed on the transformed values, with Benjamini–Hochberg adjusted p-values for multiple testing. Here, the method was run on the prevalence-filtered data.

## [1] TRUE
## |------------(25%)----------(50%)----------(75%)----------|
## [1] "Filter alpha 0.05"
##            corrected
## uncorrected FALSE
##       FALSE   205
##       TRUE      7
## [1] "Filter alpha 0.1"
##            corrected
## uncorrected FALSE
##       FALSE   202
##       TRUE     10

7.2.3.1 ALDEx2 Plot

## ⚠️ No features with p < 0.05 y q < 0.05. So, no plot would be generated.

7.2.3.2 ASV ALDEx2

## [1] TRUE
## |------------(25%)----------(50%)----------(75%)----------|
## [1] "Filter alpha 0.05"
##            corrected
## uncorrected FALSE
##       FALSE  1040
##       TRUE      5
## [1] "Filter alpha 0.1"
##            corrected
## uncorrected FALSE
##       FALSE  1034
##       TRUE     11

7.2.3.3 ALDEx2 Plot

## ⚠️ No features with p < 0.05 y q < 0.05. So, no plot would be generated.

7.2.4 DESeq2

As previously described in the analysis without prevalence filtering, DESeq2 models were applied following the authors’ recommendations for pre-filtering low-count features, particularly in cases with small sample sizes (11 patients in our smallest group). This filtering step reduces the size of the dataset, improves computational efficiency, and enhances the clarity of downstream analyses such as PCA and dispersion plots by removing features with minimal biological signal.

In the current approach, we additionally applied a prevalence filter to retain only those features that are present in a minimum proportion of samples across the dataset. This step aims to further reduce noise and focus the analysis on features that are consistently detected, thereby potentially improving the robustness and interpretability of differential abundance results.

## 
## non-significant     significant 
##              82               1

Filtering data by pvalue (0.05) conditions

## # A tibble: 5 × 3
##   Genus                            max_lfc     n
##   <chr>                              <dbl> <int>
## 1 g__Lachnospiraceae_NK4A136_group    2.36     1
## 2 g__Fusicatenibacter                 2.30     1
## 3 g__Lachnospira                      2.19     1
## 4 g__[Ruminococcus]_torques_group    -1.07     1
## 5 g__Butyricicoccus                  -1.82     1
##                                   baseMean log2FoldChange     lfcSE      stat
## da8b26f82eb70e299518e149ae85f3d9  97.87732      -1.821848 0.7594865 -2.398789
## e655845f5f4ce1633524c0c9a0b15927  89.36681       2.192742 1.0341139  2.120407
## 93b58b0ba0d326e9c8d1a81f8672c16a 393.69020      -1.071669 0.5396934 -1.985700
## 707940842caa2afe60491008e04a8173 287.26363       2.357575 0.6388942  3.690087
## 9df251784dde31e05f02b2ee1029d71c 153.34220       2.304067 0.8033413  2.868105
##                                        pvalue       padj
## da8b26f82eb70e299518e149ae85f3d9 0.0164493961 0.45509996
## e655845f5f4ce1633524c0c9a0b15927 0.0339717478 0.70491377
## 93b58b0ba0d326e9c8d1a81f8672c16a 0.0470665962 0.73254510
## 707940842caa2afe60491008e04a8173 0.0002241775 0.01860673
## 9df251784dde31e05f02b2ee1029d71c 0.0041293849 0.17136947
##                                                             Genus
## da8b26f82eb70e299518e149ae85f3d9                g__Butyricicoccus
## e655845f5f4ce1633524c0c9a0b15927                   g__Lachnospira
## 93b58b0ba0d326e9c8d1a81f8672c16a  g__[Ruminococcus]_torques_group
## 707940842caa2afe60491008e04a8173 g__Lachnospiraceae_NK4A136_group
## 9df251784dde31e05f02b2ee1029d71c              g__Fusicatenibacter

Final DESeq2 results:

7.2.4.1 ASV

## 
## non-significant     significant 
##             115               1

## # A tibble: 6 × 3
##   Genus                            max_lfc     n
##   <chr>                              <dbl> <int>
## 1 g__Lachnospira                      7.62     1
## 2 g__Streptococcus                    3.25     1
## 3 g__Bacteroides                      2.93     1
## 4 g__Lachnospiraceae_NK4A136_group    2.68     1
## 5 g__Fusicatenibacter                 2.33     1
## 6 g__Blautia                          1.65     1
##                                  baseMean log2FoldChange     lfcSE     stat
## 9df251784dde31e05f02b2ee1029d71c 153.4869       2.325699 0.8395547 2.770159
##                                       pvalue      padj               Genus
## 9df251784dde31e05f02b2ee1029d71c 0.005602901 0.3249683 g__Fusicatenibacter

Final DESeq2 results:

7.2.5 LEfSe

This analysis follows the same statistical approach as the previous LEfSe section, but includes an additional prevalence filtering step, retaining only features present in a minimum proportion of samples. This aims to reduce the influence of rare, low-information features and improve robustness in identifying biologically meaningful differences between groups.

As before, both CPM and relative abundance transformations were used, and BH-adjusted p-values were manually calculated to enable direct comparison with other methods that apply multiple testing correction.

7.2.5.1 (CPM) LEfSe groups previous sum of 1e-06

NOTE: As in the analysis without prevalence filtering, a small pseudocount (1e-06) was added to all abundance values prior to normalization, so that zero values do not lead to undefined or unstable results when log-transformed.

7.2.5.1.1 ASV Level

7.2.5.2 (RELAB) LEfSe groups previous sum of 1e-06

NOTE: As in the analysis without prevalence filtering, a small pseudocount (1e-06) was added to all abundance values prior to normalization, so that zero values do not lead to undefined or unstable results when log-transformed.

## [1] "BH correction  processing..."

7.2.5.2.1 ASV LEVEL

## [1] "BH correction  processing..."

7.2.6 LinDA (TOX_SEVERA with Prevalence Filtering)

As described in the previous section, LinDA applies a linear modeling approach to compositional microbiome data using CLR transformation and an adjustment for compositionality. In this analysis, we additionally applied a prevalence filter prior to modeling, retaining only features present in a minimum proportion of samples. This step aims to reduce noise from rare features and focus the differential abundance analysis on more consistently detected taxa.

As before, both raw and Benjamini–Hochberg (BH) adjusted p-values were calculated to assess the effect of multiple testing correction on feature selection.

## Pseudo-count approach is used.

[Figures: LinDA log-fold-change plot (plot.lfc) and volcano plot (plot.volcano)]

Without BH correction

With BH correction. Because LinDA applies strict criteria, we show and retain only the features with qval < 0.1.

7.2.6.1 ASV LEVEL

## Pseudo-count approach is used.

[Figures: LinDA log-fold-change plot (plot.lfc) and volcano plot (plot.volcano), ASV level]

Without BH correction

With BH correction. Because LinDA applies strict criteria, we show and retain only the features with qval < 0.1.

7.2.7 ZicoSeq

As previously described, ZicoSeq is a linear model– and permutation-based method designed for zero-inflated compositional data. In this section, we applied an additional prevalence filtering step prior to analysis, retaining only features detected in a minimum proportion of samples. This filtering helps reduce the influence of rare features and improves the stability of the reference-based normalization and permutation testing procedures.

Raw and permutation-adjusted p-values (FDR) were again used to evaluate the impact of multiple testing correction on the detection of differentially abundant features.
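
The general idea behind a permutation p-value can be sketched as follows. This is a deliberately simplified Python example and not ZicoSeq's procedure, which permutes within its own linear-model framework: it compares two group means and enumerates every possible relabelling of the samples. The function name `exact_permutation_pvalue` and the toy values are hypothetical.

```python
from itertools import combinations


def exact_permutation_pvalue(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in group means."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    total = sum(pooled)

    def abs_mean_diff(sum_a):
        return abs(sum_a / n_a - (total - sum_a) / n_b)

    observed = abs_mean_diff(sum(group_a))
    hits = 0
    n_splits = 0
    # Enumerate every way of labelling n_a of the pooled samples as group A.
    for indices in combinations(range(len(pooled)), n_a):
        n_splits += 1
        # Small tolerance guards against floating-point ties.
        if abs_mean_diff(sum(pooled[i] for i in indices)) >= observed - 1e-12:
            hits += 1
    return hits / n_splits


# Hypothetical abundances of one feature in two small groups.
print(exact_permutation_pvalue([1, 2, 3], [10, 11, 12]))
```

With groups of 11 and 25 samples, full enumeration is far too large to be practical, so permutation-based methods draw a random subset of relabellings instead.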

## For proportion and other data types,  posterior sampling will not be performed!
## The data has  36  samples and  128  features will be tested!
## On average,  1  outlier counts will be replaced for each feature!
## Permutation testing ...
## .........
## Completed!

7.2.7.1 ZicoSeq Plot

7.2.7.2 ASV

## For proportion and other data types,  posterior sampling will not be performed!
## The data has  36  samples and  224  features will be tested!
## On average,  1  outlier counts will be replaced for each feature!
## Permutation testing ...
## .........
## Completed!

7.2.7.3 ZicoSeq Plot

8 Evaluating common and uncommon bacteria

In this section we identify and visualize the bacteria detected by several (or all) of the methods, as well as the biomarkers detected uniquely by each combination of method and approach.
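
One way to separate shared from method-specific taxa is with set operations, sketched below in Python. The function name `shared_and_unique`, the method labels, and the taxon labels are placeholders for illustration and do not correspond to the actual results.

```python
def shared_and_unique(results):
    """Split detected taxa into those shared by all methods and those unique to one.

    `results` maps a method label to the set of taxa reported by that method.
    """
    all_sets = list(results.values())
    shared_by_all = set.intersection(*all_sets) if all_sets else set()
    unique = {}
    for method, taxa in results.items():
        # Union of everything reported by the other methods.
        others = set().union(*(t for m, t in results.items() if m != method))
        unique[method] = taxa - others
    return shared_by_all, unique


# Placeholder detections for three hypothetical method/approach combinations.
toy = {
    "method_a": {"taxon_1", "taxon_2", "taxon_3"},
    "method_b": {"taxon_2", "taxon_3"},
    "method_c": {"taxon_3", "taxon_4"},
}
shared, unique = shared_and_unique(toy)
print(sorted(shared))
print({method: sorted(taxa) for method, taxa in unique.items()})
```

Taxa shared by only some of the methods fall into neither group here; they could be recovered by counting, for each taxon, how many methods report it.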

## $res_aldex2_noprev_ASV_noBH_df
##                                         pval      qval        wi.ep    wi.eBH
## 15e05255b2aa8ee3524ca61eb207bb18 0.081306247 1.0000000 0.3186740038 1.0000000
## 44cae38ace30f54ee81cc5c5b2ce47eb 0.083697892 0.9773039 0.0474647663 0.9897809
## 4d6fe682a4dd9aad8decfab830a193e0 0.091213372 0.9993405 0.0812125488 1.0000000
## 8f3c8cb8544640becafb5777eb7a5858 0.098661800 0.9922256 0.0709048977 0.9895994
## 582fae36f33acd1efdcaf7cacf00ef0a 0.050969056 1.0000000 0.0485136981 1.0000000
## f664a73e7de6e00ff70e013369499cbd 0.021406655 0.9883441 0.0514005485 0.9952138
## af2150fef76b295092cd60a788cbe500 0.047749749 1.0000000 0.0035463241 0.9893937
## 25496a64517e79fae616627d6927854e 0.063076660 1.0000000 0.0566321001 0.9904108
## 25af29e1b2d121f8aae468d270d75518 0.028208053 1.0000000 0.0393166041 0.9969398
## cc465b18f37cb7e609cff5ba5ed2bffe 0.046345362 0.9809461 0.0323832465 0.9884264
## 9df251784dde31e05f02b2ee1029d71c 0.004430293 1.0000000 0.0008328149 0.8833835
## be1fe9bfe424ad27f568d8e2d7b42c19 0.046123476 1.0000000 0.0202594433 0.9968492
##                                    rab.all  rab.win.0  rab.win.1      logFC
## 15e05255b2aa8ee3524ca61eb207bb18 10.618443 10.7822993 10.5448272 -0.5528752
## 44cae38ace30f54ee81cc5c5b2ce47eb  1.843180  7.1178326  1.0823890 -3.9416592
## 4d6fe682a4dd9aad8decfab830a193e0  8.621118  9.9432678  8.3560278 -2.0688815
## 8f3c8cb8544640becafb5777eb7a5858  1.494705  6.7436834  0.9106689 -3.9405175
## 582fae36f33acd1efdcaf7cacf00ef0a  6.369136  1.1387618  7.2385531  4.3958509
## f664a73e7de6e00ff70e013369499cbd  1.653785  0.1804158  3.2609676  4.6571910
## af2150fef76b295092cd60a788cbe500  8.180591  6.1875508  8.5973346  2.6136809
## 25496a64517e79fae616627d6927854e  2.114770  0.4038674  4.3913562  3.7245176
## 25af29e1b2d121f8aae468d270d75518  5.421492  0.7303390  8.6733518  6.4149225
## cc465b18f37cb7e609cff5ba5ed2bffe  1.644493  7.0821186  0.8433434 -4.9008214
## 9df251784dde31e05f02b2ee1029d71c  8.302308  2.8409041  8.6772412  5.6695325
## be1fe9bfe424ad27f568d8e2d7b42c19  7.075798  2.1943857  7.7086057  3.8119329
##                                  diff.win     effect  effect.low effect.high
## 15e05255b2aa8ee3524ca61eb207bb18 2.378879 -0.1704568 -10.3155817    2.684609
## 44cae38ace30f54ee81cc5c5b2ce47eb 6.405141 -0.5539998  -7.9462601    2.337585
## 4d6fe682a4dd9aad8decfab830a193e0 5.514116 -0.3273025  -7.4253697    3.409693
## 8f3c8cb8544640becafb5777eb7a5858 7.260082 -0.5170380  -6.4659419    2.598227
## 582fae36f33acd1efdcaf7cacf00ef0a 6.951487  0.6009216  -2.3392843    8.256984
## f664a73e7de6e00ff70e013369499cbd 6.238730  0.6496785  -2.0888978    6.012369
## af2150fef76b295092cd60a788cbe500 4.647590  0.5606651  -2.4723960   11.856755
## 25496a64517e79fae616627d6927854e 5.376925  0.6091681  -2.2631984    6.034096
## 25af29e1b2d121f8aae468d270d75518 8.393566  0.6784476  -2.8121496    7.596141
## cc465b18f37cb7e609cff5ba5ed2bffe 7.224294 -0.6078109  -7.1714325    2.091439
## 9df251784dde31e05f02b2ee1029d71c 4.956521  0.9096101  -0.8369337   10.392921
## be1fe9bfe424ad27f568d8e2d7b42c19 6.791876  0.5509743  -2.2798787    6.019764
##                                    overlap                          bacteria
## 15e05255b2aa8ee3524ca61eb207bb18 0.3929078                    g__Collinsella
## 44cae38ace30f54ee81cc5c5b2ce47eb 0.2780142                      g__Alistipes
## 4d6fe682a4dd9aad8decfab830a193e0 0.3205674                g__Parabacteroides
## 8f3c8cb8544640becafb5777eb7a5858 0.2911932                     g__Parvimonas
## 582fae36f33acd1efdcaf7cacf00ef0a 0.2769887 g__[Eubacterium]_ventriosum_group
## f664a73e7de6e00ff70e013369499cbd 0.2581561                    g__Lachnospira
## af2150fef76b295092cd60a788cbe500 0.2113476                        g__Blautia
## 25496a64517e79fae616627d6927854e 0.2514205             f__Lachnospiraceae_NA
## 25af29e1b2d121f8aae468d270d75518 0.2567376                      g__Roseburia
## cc465b18f37cb7e609cff5ba5ed2bffe 0.2642046               g__Negativibacillus
## 9df251784dde31e05f02b2ee1029d71c 0.1633524               g__Fusicatenibacter
## be1fe9bfe424ad27f568d8e2d7b42c19 0.2471591  g__Lachnospiraceae_NK4A136_group
##                                  enrich_group    direction orientation qval.txt
## 15e05255b2aa8ee3524ca61eb207bb18            0 Negative LFC          -1        -
## 44cae38ace30f54ee81cc5c5b2ce47eb            0 Negative LFC          -1        -
## 4d6fe682a4dd9aad8decfab830a193e0            0 Negative LFC          -1        -
## 8f3c8cb8544640becafb5777eb7a5858            0 Negative LFC          -1        -
## 582fae36f33acd1efdcaf7cacf00ef0a            1 Positive LFC           1        -
## f664a73e7de6e00ff70e013369499cbd            1 Positive LFC           1        -
## af2150fef76b295092cd60a788cbe500            1 Positive LFC           1        -
## 25496a64517e79fae616627d6927854e            1 Positive LFC           1        -
## 25af29e1b2d121f8aae468d270d75518            1 Positive LFC           1        -
## cc465b18f37cb7e609cff5ba5ed2bffe            0 Negative LFC          -1        -
## 9df251784dde31e05f02b2ee1029d71c            1 Positive LFC           1        -
## be1fe9bfe424ad27f568d8e2d7b42c19            1 Positive LFC           1        -
##                                  prev BH herramienta
## 15e05255b2aa8ee3524ca61eb207bb18   no no      ALDEX2
## 44cae38ace30f54ee81cc5c5b2ce47eb   no no      ALDEX2
## 4d6fe682a4dd9aad8decfab830a193e0   no no      ALDEX2
## 8f3c8cb8544640becafb5777eb7a5858   no no      ALDEX2
## 582fae36f33acd1efdcaf7cacf00ef0a   no no      ALDEX2
## f664a73e7de6e00ff70e013369499cbd   no no      ALDEX2
## af2150fef76b295092cd60a788cbe500   no no      ALDEX2
## 25496a64517e79fae616627d6927854e   no no      ALDEX2
## 25af29e1b2d121f8aae468d270d75518   no no      ALDEX2
## cc465b18f37cb7e609cff5ba5ed2bffe   no no      ALDEX2
## 9df251784dde31e05f02b2ee1029d71c   no no      ALDEX2
## be1fe9bfe424ad27f568d8e2d7b42c19   no no      ALDEX2
## 
## $res_aldex2_noprev_noBH_df
##                                         pval      qval       wi.ep    wi.eBH
## b0553a7cf3e72573824b0dacf2747ae5 0.007161728 0.5958931 0.013333459 0.7732358
## 1c441acb05af7d8fee64aaf52bf3d223 0.061945731 0.7231110 0.026177356 0.7533942
## 8f3c8cb8544640becafb5777eb7a5858 0.047103718 0.7607209 0.017084796 0.7077159
## 1d2b00ff8b7477d2b6c2b8043f7c9d31 0.092539465 0.8758061 0.075854346 0.8645053
## 114d0aefa07e7ab7836011f44281b737 0.092350848 0.8284039 0.053541246 0.8288619
## 582fae36f33acd1efdcaf7cacf00ef0a 0.016701328 0.9773422 0.009813534 0.7933836
## e655845f5f4ce1633524c0c9a0b15927 0.059441346 0.9988693 0.020032285 0.9080566
## 93b58b0ba0d326e9c8d1a81f8672c16a 0.050689141 0.9764837 0.032680282 0.9230317
## 707940842caa2afe60491008e04a8173 0.055272522 1.0000000 0.025753267 0.9528603
## d16a4faef202f2f76497bdd5a6f454b7 0.049264558 0.6690419 0.020407300 0.7942940
## 9df251784dde31e05f02b2ee1029d71c 0.005184355 0.9050780 0.001301196 0.3453439
##                                     rab.all rab.win.0  rab.win.1     logFC
## b0553a7cf3e72573824b0dacf2747ae5  5.9670568 7.0644208  5.4480364 -1.310869
## 1c441acb05af7d8fee64aaf52bf3d223 -0.6284864 2.7517308 -1.4363635 -3.726530
## 8f3c8cb8544640becafb5777eb7a5858  0.4496205 5.0066903 -0.5332208 -4.492224
## 1d2b00ff8b7477d2b6c2b8043f7c9d31 -0.7502486 3.1082902 -1.3761178 -4.095487
## 114d0aefa07e7ab7836011f44281b737 -0.4186547 3.2534734 -1.1180664 -3.755543
## 582fae36f33acd1efdcaf7cacf00ef0a  5.3371801 1.1157413  6.2603148  3.775015
## e655845f5f4ce1633524c0c9a0b15927  5.2218610 2.7266055  5.8314296  3.135938
## 93b58b0ba0d326e9c8d1a81f8672c16a  8.4697640 9.0942406  8.2406981 -1.243068
## 707940842caa2afe60491008e04a8173  7.2110695 5.6607529  7.7782432  2.202868
## d16a4faef202f2f76497bdd5a6f454b7  1.8246068 3.3260163  0.5232515 -2.799724
## 9df251784dde31e05f02b2ee1029d71c  6.1513335 0.8069237  6.5500678  5.624240
##                                  diff.win     effect  effect.low effect.high
## b0553a7cf3e72573824b0dacf2747ae5 1.995690 -0.5055607 -10.6558175    1.538845
## 1c441acb05af7d8fee64aaf52bf3d223 5.638639 -0.5839081  -5.9911102    2.859307
## 8f3c8cb8544640becafb5777eb7a5858 6.460318 -0.6829475  -5.4795421    2.068761
## 1d2b00ff8b7477d2b6c2b8043f7c9d31 7.177791 -0.5436808  -5.7309843    1.940185
## 114d0aefa07e7ab7836011f44281b737 5.947602 -0.5278794  -5.7334893    2.115891
## 582fae36f33acd1efdcaf7cacf00ef0a 5.290303  0.6652220  -1.7773935    9.870818
## e655845f5f4ce1633524c0c9a0b15927 5.985813  0.4862400  -2.6690891    7.035152
## 93b58b0ba0d326e9c8d1a81f8672c16a 2.477538 -0.4659539  -6.9315649    2.852879
## 707940842caa2afe60491008e04a8173 3.999880  0.4333537  -1.9164662    9.992439
## d16a4faef202f2f76497bdd5a6f454b7 4.192407 -0.5967869  -8.2312674    2.189089
## 9df251784dde31e05f02b2ee1029d71c 5.101456  0.8958155  -0.8476929   11.615676
##                                    overlap                          bacteria
## b0553a7cf3e72573824b0dacf2747ae5 0.2542614       g__Family_XIII_AD3011_group
## 1c441acb05af7d8fee64aaf52bf3d223 0.2482270    g__[Eubacterium]_nodatum_group
## 8f3c8cb8544640becafb5777eb7a5858 0.2397164                     g__Parvimonas
## 1d2b00ff8b7477d2b6c2b8043f7c9d31 0.2780142             g__Peptostreptococcus
## 114d0aefa07e7ab7836011f44281b737 0.2784091                  g__Solobacterium
## 582fae36f33acd1efdcaf7cacf00ef0a 0.2241135 g__[Eubacterium]_ventriosum_group
## e655845f5f4ce1633524c0c9a0b15927 0.2556819                    g__Lachnospira
## 93b58b0ba0d326e9c8d1a81f8672c16a 0.2751774   g__[Ruminococcus]_torques_group
## 707940842caa2afe60491008e04a8173 0.2627841  g__Lachnospiraceae_NK4A136_group
## d16a4faef202f2f76497bdd5a6f454b7 0.2368795             o__Oscillospirales_NA
## 9df251784dde31e05f02b2ee1029d71c 0.1730497               g__Fusicatenibacter
##                                  enrich_group    direction orientation qval.txt
## b0553a7cf3e72573824b0dacf2747ae5            0 Negative LFC          -1        -
## 1c441acb05af7d8fee64aaf52bf3d223            0 Negative LFC          -1        -
## 8f3c8cb8544640becafb5777eb7a5858            0 Negative LFC          -1        -
## 1d2b00ff8b7477d2b6c2b8043f7c9d31            0 Negative LFC          -1        -
## 114d0aefa07e7ab7836011f44281b737            0 Negative LFC          -1        -
## 582fae36f33acd1efdcaf7cacf00ef0a            1 Positive LFC           1        -
## e655845f5f4ce1633524c0c9a0b15927            1 Positive LFC           1        -
## 93b58b0ba0d326e9c8d1a81f8672c16a            0 Negative LFC          -1        -
## 707940842caa2afe60491008e04a8173            1 Positive LFC           1        -
## d16a4faef202f2f76497bdd5a6f454b7            0 Negative LFC          -1        -
## 9df251784dde31e05f02b2ee1029d71c            1 Positive LFC           1        -
##                                  prev BH herramienta
## b0553a7cf3e72573824b0dacf2747ae5   no no      ALDEX2
## 1c441acb05af7d8fee64aaf52bf3d223   no no      ALDEX2
## 8f3c8cb8544640becafb5777eb7a5858   no no      ALDEX2
## 1d2b00ff8b7477d2b6c2b8043f7c9d31   no no      ALDEX2
## 114d0aefa07e7ab7836011f44281b737   no no      ALDEX2
## 582fae36f33acd1efdcaf7cacf00ef0a   no no      ALDEX2
## e655845f5f4ce1633524c0c9a0b15927   no no      ALDEX2
## 93b58b0ba0d326e9c8d1a81f8672c16a   no no      ALDEX2
## 707940842caa2afe60491008e04a8173   no no      ALDEX2
## d16a4faef202f2f76497bdd5a6f454b7   no no      ALDEX2
## 9df251784dde31e05f02b2ee1029d71c   no no      ALDEX2
## 
## $res_aldex2_prev_ASV_noBH_df
##                            feature        pval      qval       wi.ep    wi.eBH
## 1 f664a73e7de6e00ff70e013369499cbd 0.020541545 0.8800437 0.050476132 0.9278201
## 2 af2150fef76b295092cd60a788cbe500 0.040464692 1.0000000 0.003428486 0.8575446
## 3 25af29e1b2d121f8aae468d270d75518 0.026012082 0.9991868 0.033069929 0.9735177
## 4 9df251784dde31e05f02b2ee1029d71c 0.004734641 0.9865233 0.000780522 0.5739140
## 5 be1fe9bfe424ad27f568d8e2d7b42c19 0.047431868 1.0000000 0.020298666 0.9778105
##     rab.all  rab.win.0 rab.win.1    logFC diff.win    effect effect.low
## 1 0.8796375 -0.6828906  2.690379 4.371759 6.193586 0.6621493  -2.181100
## 2 7.3793704  5.3620220  7.889656 2.710801 4.502323 0.5448366  -2.675135
## 3 5.0039640  0.2428290  7.853464 6.319063 8.218281 0.7023558  -2.766156
## 4 7.5205554  2.0614681  7.855457 5.763563 5.030301 0.8873925  -0.700537
## 5 6.3887587  1.4669009  6.884206 3.444645 6.861979 0.4985438  -2.905656
##   effect.high   overlap                         bacteria enrich_group
## 1    6.786999 0.2627841                   g__Lachnospira            1
## 2    9.547727 0.1858157                       g__Blautia            1
## 3    7.167203 0.2684660                     g__Roseburia            1
## 4   10.321697 0.1631207              g__Fusicatenibacter            1
## 5    7.158569 0.2709220 g__Lachnospiraceae_NK4A136_group            1
##      direction orientation qval.txt prev BH herramienta
## 1 Positive LFC           1        -   no no      ALDEX2
## 2 Positive LFC           1        -   no no      ALDEX2
## 3 Positive LFC           1        -   no no      ALDEX2
## 4 Positive LFC           1        -   no no      ALDEX2
## 5 Positive LFC           1        -   no no      ALDEX2
## 
## $res_aldex2_prev_noBH_df
##                            feature        pval      qval        wi.ep    wi.eBH
## 1 b0553a7cf3e72573824b0dacf2747ae5 0.008519525 0.5478539 0.0335363983 0.8630563
## 2 8f3c8cb8544640becafb5777eb7a5858 0.041246470 0.6191173 0.0159794028 0.5966760
## 3 582fae36f33acd1efdcaf7cacf00ef0a 0.018859728 0.9182940 0.0091581720 0.5668209
## 4 e655845f5f4ce1633524c0c9a0b15927 0.033112543 0.9345265 0.0124168816 0.6119144
## 5 707940842caa2afe60491008e04a8173 0.049927956 0.9994508 0.0167886267 0.7238755
## 6 d16a4faef202f2f76497bdd5a6f454b7 0.036701359 0.5461829 0.0177108014 0.7077479
## 7 9df251784dde31e05f02b2ee1029d71c 0.004806497 0.7188198 0.0009108144 0.1776147
##      rab.all rab.win.0  rab.win.1     logFC diff.win     effect  effect.low
## 1  5.1278786 6.2903223  4.6345668 -1.388413 2.599730 -0.4216156 -10.3205407
## 2 -0.6123744 4.2268388 -1.5903272 -4.827950 6.561867 -0.7033573  -6.8561444
## 3  4.5468408 0.4635261  5.5472130  3.705607 5.679814  0.6579187  -1.9683621
## 4  4.5922986 0.6067977  5.0630297  3.936184 6.131197  0.5976115  -2.7934115
## 5  6.4812773 5.2569354  7.2728346  2.298591 4.017778  0.4983860  -2.1579691
## 6  0.7669586 2.5113131 -0.6537009 -2.770942 4.410486 -0.5739042  -7.9253388
## 7  5.4863544 0.1430596  5.9220551  5.532146 5.206290  0.8894627  -0.7968797
##   effect.high   overlap                          bacteria enrich_group
## 1    1.540806 0.2723405       g__Family_XIII_AD3011_group            0
## 2    1.938861 0.2241135                     g__Parvimonas            0
## 3    8.670267 0.2255320 g__[Eubacterium]_ventriosum_group            1
## 4    8.837581 0.2372160                    g__Lachnospira            1
## 5    8.674798 0.2471591  g__Lachnospiraceae_NK4A136_group            1
## 6    2.570206 0.2453901             o__Oscillospirales_NA            0
## 7    9.299084 0.1590910               g__Fusicatenibacter            1
##      direction orientation qval.txt prev BH herramienta
## 1 Negative LFC          -1        -   no no      ALDEX2
## 2 Negative LFC          -1        -   no no      ALDEX2
## 3 Positive LFC           1        -   no no      ALDEX2
## 4 Positive LFC           1        -   no no      ALDEX2
## 5 Positive LFC           1        -   no no      ALDEX2
## 6 Negative LFC          -1        -   no no      ALDEX2
## 7 Positive LFC           1        -   no no      ALDEX2
## 
## $res_ancom_noprev_ASV_BH_df
##            bacteria      lfc        pval       qval    direction orientation
## 1 g__Lachnospira_NA 1.872501 4.23892e-06 0.02578111 Positive LFC           1
##   target_level    logFC      pvalue       padj enrich_group prev  BH
## 1          ASV 1.872501 4.23892e-06 0.02578111            1   no yes
##   herramienta
## 1     ANCOMBC
## 
## $res_ancom_noprev_ASV_noBH_df
##                                bacteria       lfc         pval         qval
## 1                g__Fusicatenibacter_NA  2.845322 3.486388e-05 3.486388e-05
## 2                       g__Roseburia_NA  2.394856 3.350481e-03 3.350481e-03
## 3                g__Faecalibacterium_NA  1.961100 4.794771e-02 4.794771e-02
## 4                     g__Lachnospira_NA  1.872501 4.238920e-06 4.238920e-06
## 5                         g__Blautia_NA  1.770128 6.613832e-03 6.613832e-03
## 6  g__[Eubacterium]_ventriosum_group_NA  1.753572 1.062581e-02 1.062581e-02
## 7              s__Bacteroides_stercoris  1.398134 9.368408e-03 9.368408e-03
## 8                 f__Lachnospiraceae_NA  1.294803 3.632977e-04 3.632977e-04
## 9                 f__Lachnospiraceae_NA  1.093063 6.979313e-04 6.979313e-04
## 10                    g__Akkermansia_NA  1.079400 7.098286e-03 7.098286e-03
## 11                o__Oscillospirales_NA -1.041650 1.291157e-02 1.291157e-02
## 12                  g__Solobacterium_NA -1.308280 4.301887e-02 4.301887e-02
## 13            s__Parabacteroides_merdae -1.495987 4.830643e-02 4.830643e-02
## 14                  s__Alistipes_shahii -1.592668 1.419007e-02 1.419007e-02
## 15   g__[Ruminococcus]_torques_group_NA -1.722294 4.158128e-02 4.158128e-02
## 16                     g__Parvimonas_NA -2.001690 9.102879e-03 9.102879e-03
## 17            s__Bifidobacterium_longum -2.006056 3.984108e-02 3.984108e-02
## 18    g__[Ruminococcus]_gnavus_group_NA -2.031157 3.322609e-02 3.322609e-02
## 19               g__Negativibacillus_NA -2.038696 7.068300e-03 7.068300e-03
##       direction orientation target_level     logFC       pvalue         padj
## 1  Positive LFC           1          ASV  2.845322 3.486388e-05 3.486388e-05
## 2  Positive LFC           1          ASV  2.394856 3.350481e-03 3.350481e-03
## 3  Positive LFC           1          ASV  1.961100 4.794771e-02 4.794771e-02
## 4  Positive LFC           1          ASV  1.872501 4.238920e-06 4.238920e-06
## 5  Positive LFC           1          ASV  1.770128 6.613832e-03 6.613832e-03
## 6  Positive LFC           1          ASV  1.753572 1.062581e-02 1.062581e-02
## 7  Positive LFC           1          ASV  1.398134 9.368408e-03 9.368408e-03
## 8  Positive LFC           1          ASV  1.294803 3.632977e-04 3.632977e-04
## 9  Positive LFC           1          ASV  1.093063 6.979313e-04 6.979313e-04
## 10 Positive LFC           1          ASV  1.079400 7.098286e-03 7.098286e-03
## 11 Negative LFC          -1          ASV -1.041650 1.291157e-02 1.291157e-02
## 12 Negative LFC          -1          ASV -1.308280 4.301887e-02 4.301887e-02
## 13 Negative LFC          -1          ASV -1.495987 4.830643e-02 4.830643e-02
## 14 Negative LFC          -1          ASV -1.592668 1.419007e-02 1.419007e-02
## 15 Negative LFC          -1          ASV -1.722294 4.158128e-02 4.158128e-02
## 16 Negative LFC          -1          ASV -2.001690 9.102879e-03 9.102879e-03
## 17 Negative LFC          -1          ASV -2.006056 3.984108e-02 3.984108e-02
## 18 Negative LFC          -1          ASV -2.031157 3.322609e-02 3.322609e-02
## 19 Negative LFC          -1          ASV -2.038696 7.068300e-03 7.068300e-03
##    enrich_group prev BH herramienta
## 1             1   no no     ANCOMBC
## 2             1   no no     ANCOMBC
## 3             1   no no     ANCOMBC
## 4             1   no no     ANCOMBC
## 5             1   no no     ANCOMBC
## 6             1   no no     ANCOMBC
## 7             1   no no     ANCOMBC
## 8             1   no no     ANCOMBC
## 9             1   no no     ANCOMBC
## 10            1   no no     ANCOMBC
## 11            0   no no     ANCOMBC
## 12            0   no no     ANCOMBC
## 13            0   no no     ANCOMBC
## 14            0   no no     ANCOMBC
## 15            0   no no     ANCOMBC
## 16            0   no no     ANCOMBC
## 17            0   no no     ANCOMBC
## 18            0   no no     ANCOMBC
## 19            0   no no     ANCOMBC

9 Paper Plots

## noprev: 80 taxa; 22 methods; NA check passed.
## prev:   84 taxa; 26 methods; NA check passed.

The figures presented in this paper focus exclusively on the prevalence-based approach, because prevalence filtering reduces noise from rare features and gives a more interpretable representation of the results.

9.1 Paper Figure 1: Cladogram

Figure 1 (main manuscript) displays a cladogram illustrating the top four unique bacterial taxa identified within each toxicity class. The visualization highlights differences in taxonomic composition across classes based on prevalence patterns. Only taxa identified through the prevalence-based analysis are included.

This figure was revised and modified in the final version of the manuscript to improve clarity.

9.2 Supplementary Figure 1: Top Four Bacterial Biomarkers by Toxicity Class

Supplementary Figure 1 presents a Venn diagram illustrating the top four unique bacterial taxa associated with each toxicity class. The comparison is based solely on the prevalence-based analysis.

9.3 Supplementary Figure 2: Toxicity Heatmap

Supplementary Figure 2 shows a heatmap generated using only the prevalence-derived outputs. The figure highlights distinct abundance and distribution patterns of bacterial taxa across the toxicity classes.

9.4 Supplementary Figure 3: Oral Genera Distribution (Alluvial Plot)

Supplementary Figure 3 provides an alluvial plot illustrating the distribution of oral bacterial genera across toxicity categories. This visualization emphasizes shifts in genus-level composition relative to toxicity status.

## No valid elements detected: res_deseq2_noprev_ASV_BH_df, res_deseq2_noprev_ASV_noBH_df, res_deseq2_noprev_BH_df, res_deseq2_noprev_noBH_df, res_deseq2_prev_ASV_BH_df, res_deseq2_prev_BH_df, res_lefse_cpm_ASV_noprev_BH_df, res_lefse_cpm_prev_ASV_BH_df, res_lefse_cpm_prev_BH_df, res_lefse_ra_ASV_prev_BH_df, res_lefse_ra_noprev_BH_df, res_lefse_ra_prev_BH_df, res_zicoseq_noprev_ASV_BH, res_zicoseq_prev_ASV_BH

9.5 Supplementary Figure 4: Oral Bacteria Heatmap (Presence/Absence in NoPrevalence/Prevalence approaches)

Supplementary Figure 4 displays a presence/absence heatmap comparing oral bacterial taxa across the NoPrevalence and Prevalence approaches. The figure highlights how the detection of these taxa differs depending on whether prevalence filtering is applied.

10 Covariates (confounders evaluation) + Plots

10.1 ALDEx2 (ASV)

Here we relaxed the significance threshold to 0.25, but no feature was retained. The same happened at the genus level.

## < table of extent 0 x 0 >

10.2 ANCOM-BC

10.2.1 Unfiltered prevalence

10.2.1.1 No BH Correction

10.2.1.2 With BH Correction

10.2.2 Prevalence filtering

10.2.2.1 No BH Correction

10.2.2.2 With BH Correction

10.3 DESeq2

10.3.1 Unfiltered prevalence

10.3.1.1 Genus

## 
## non-significant     significant 
##              82               1

10.3.1.2 ASV

## 
## non-significant     significant 
##              82               1

10.3.2 Prevalence filtering

10.3.2.1 Genus

## 
## non-significant     significant 
##              82               1

10.3.2.2 ASV

## 
## non-significant     significant 
##              82               1

10.4 LinDA

10.4.1 Unfiltered prevalence

10.4.1.1 Genus

## Pseudo-count approach is used.

Because LinDA applies strict criteria, we show and retain only the features with qval < 0.1.

##                           taxon_id     logFC        se        qval direction
## 1 8f3c8cb8544640becafb5777eb7a5858 -4.743002 1.0324987 0.009847236       Low
## 2 9df251784dde31e05f02b2ee1029d71c  4.166458 0.8984472 0.009847236       Sev
##   qval_txt orientation     target_level alpha_grp               Genus
## 1       **          -1 tox (sev vs low)         1       g__Parvimonas
## 2       **           1 tox (sev vs low)         1 g__Fusicatenibacter
##                 label
## 1       g__Parvimonas
## 2 g__Fusicatenibacter

10.4.1.2 ASV

## Pseudo-count approach is used.
## $toxsev
## [1] "8f3c8cb8544640becafb5777eb7a5858" "358362b29cb4e52ca01e236430b08043"
## [3] "80ab652869a99e7aa6aa94f6963a494a" "9df251784dde31e05f02b2ee1029d71c"
## 
## $Sexmale
## character(0)
## 
## $Siterectum
##  [1] "1d122958ee96131b014bbe0fecba81b1" "ed8b4ed1581ca52b0788f8039536313d"
##  [3] "3dc3273da2f47f5733591d9b58cbaf1f" "3b299fbe01dee5fffafb39ed55fd3e78"
##  [5] "e59af00273a37b1b1c4708b5971deda2" "82e5115a8f1ead595da52c79fe587645"
##  [7] "105344b0b04df4d06036c8c91f439255" "6ee9e6771e438dcdfbbbb4094c40edb8"
##  [9] "02bf15068e3e7108f4912f7b2e144b1c" "57339de5c8b87df1e1c2245bb19d8b51"
## [11] "ac07428611cda65054aec775f50cb486" "0991016c9dae25e6dabbd6652d71239a"
## 
## $SiteRIGHT
## character(0)
## 
## $`Sitetransverse-colon`
##  [1] "af3909edeceab197f554e240fe97af83" "f1bf04ec7f805bc268eaddc0c1a1b79a"
##  [3] "cb17ef306146fe98440e15e8181ab0a4" "1c441acb05af7d8fee64aaf52bf3d223"
##  [5] "27f4cbb6d84d5883800387166ffb6906" "f64256b1969b2f5459213b19bef94eaa"
##  [7] "17e293d9412552f2483c73910ff7dd8d" "4400354e1979621239b87e92c5a38a29"
##  [9] "8cf427b8122d1849f3b951790cc3dcd3" "747a284c95e1af0ec02d6582f6565774"
## [11] "6630583dd78eb092ece5e17515eb301d" "50ef4b6ac91b2cddcb87cfe5a3b622f8"
## [13] "2f7a13c13416fff5be074cf32995ec44" "cdc5f1f0fcd4c3dd584022d5e28a86c6"
## [15] "186c9fa38e1428e565a3e55fc8483a88" "2bfe6dc4610795667fcc404fad0e6189"
## [17] "3963890db4dddea7f233911bc8a6a079" "ec55f07c37e1dd699e2ca82a3dd57489"
## [19] "1cf8fc19ce34066f96ad6db0dd004b5a" "9af0eb9868f7dac5bfdcd08fd74d293a"
## [21] "020599fbbb43f8c01624f2976599bdcc" "98b6e2d543691237446ced7c52325a2c"
## [23] "37132fca7c4b6ce7b4c34abfa0261664" "c96ab9fa28e0a9a26ee51ac80bbfafde"
## [25] "1ef02b3a08a0fbd22e0d8fe08ab9b0d2" "6fbf9c0e088b6bab28b6b042b495a56d"
## [27] "7bbd42619a8874e3ad59ada1714da878" "b08743e2e152c9e6c117cf375bb04ac1"
## [29] "3f7fd596312c975960542b051cd5cf43" "5581c82e6046e3be5e70a4380bf56b44"
## [31] "8d4dca0e1ab2475b5c5d37bcb46aee5c" "310023723aab4696cdb707dc6bb4d94a"
## [33] "015ce998953e085dd4d5011f7edd05e2" "257a3feb1cfa43f2c55429863b0f16b3"
## [35] "93bd5b75bbf8507566602238d02b8fcf" "7a4b03d408be56f063ae11449f3044a0"
## 
## $age_groupover70
## character(0)
## 
## $age_groupunder60
## character(0)
##  [1] "8f3c8cb8544640becafb5777eb7a5858" "358362b29cb4e52ca01e236430b08043"
##  [3] "80ab652869a99e7aa6aa94f6963a494a" "9df251784dde31e05f02b2ee1029d71c"
##  [5] "1d122958ee96131b014bbe0fecba81b1" "ed8b4ed1581ca52b0788f8039536313d"
##  [7] "3dc3273da2f47f5733591d9b58cbaf1f" "3b299fbe01dee5fffafb39ed55fd3e78"
##  [9] "e59af00273a37b1b1c4708b5971deda2" "82e5115a8f1ead595da52c79fe587645"
## [11] "105344b0b04df4d06036c8c91f439255" "6ee9e6771e438dcdfbbbb4094c40edb8"
## [13] "02bf15068e3e7108f4912f7b2e144b1c" "57339de5c8b87df1e1c2245bb19d8b51"
## [15] "ac07428611cda65054aec775f50cb486" "0991016c9dae25e6dabbd6652d71239a"
## [17] "af3909edeceab197f554e240fe97af83" "f1bf04ec7f805bc268eaddc0c1a1b79a"
## [19] "cb17ef306146fe98440e15e8181ab0a4" "1c441acb05af7d8fee64aaf52bf3d223"
## [21] "27f4cbb6d84d5883800387166ffb6906" "f64256b1969b2f5459213b19bef94eaa"
## [23] "17e293d9412552f2483c73910ff7dd8d" "4400354e1979621239b87e92c5a38a29"
## [25] "8cf427b8122d1849f3b951790cc3dcd3" "747a284c95e1af0ec02d6582f6565774"
## [27] "6630583dd78eb092ece5e17515eb301d" "50ef4b6ac91b2cddcb87cfe5a3b622f8"
## [29] "2f7a13c13416fff5be074cf32995ec44" "cdc5f1f0fcd4c3dd584022d5e28a86c6"
## [31] "186c9fa38e1428e565a3e55fc8483a88" "2bfe6dc4610795667fcc404fad0e6189"
## [33] "3963890db4dddea7f233911bc8a6a079" "ec55f07c37e1dd699e2ca82a3dd57489"
## [35] "1cf8fc19ce34066f96ad6db0dd004b5a" "9af0eb9868f7dac5bfdcd08fd74d293a"
## [37] "020599fbbb43f8c01624f2976599bdcc" "98b6e2d543691237446ced7c52325a2c"
## [39] "37132fca7c4b6ce7b4c34abfa0261664" "c96ab9fa28e0a9a26ee51ac80bbfafde"
## [41] "1ef02b3a08a0fbd22e0d8fe08ab9b0d2" "6fbf9c0e088b6bab28b6b042b495a56d"
## [43] "7bbd42619a8874e3ad59ada1714da878" "b08743e2e152c9e6c117cf375bb04ac1"
## [45] "3f7fd596312c975960542b051cd5cf43" "5581c82e6046e3be5e70a4380bf56b44"
## [47] "8d4dca0e1ab2475b5c5d37bcb46aee5c" "310023723aab4696cdb707dc6bb4d94a"
## [49] "015ce998953e085dd4d5011f7edd05e2" "257a3feb1cfa43f2c55429863b0f16b3"
## [51] "93bd5b75bbf8507566602238d02b8fcf" "7a4b03d408be56f063ae11449f3044a0"
## Taxonomy Table:     [52 taxa by 7 taxonomic ranks]:
##                                  Domain        Phylum               
## 8f3c8cb8544640becafb5777eb7a5858 "d__Bacteria" "p__Firmicutes"      
## 358362b29cb4e52ca01e236430b08043 "d__Bacteria" "p__Firmicutes"      
## 80ab652869a99e7aa6aa94f6963a494a "d__Bacteria" "p__Firmicutes"      
## 9df251784dde31e05f02b2ee1029d71c "d__Bacteria" "p__Firmicutes"      
## 1d122958ee96131b014bbe0fecba81b1 "d__Bacteria" "p__Actinobacteriota"
## ed8b4ed1581ca52b0788f8039536313d "d__Bacteria" "p__Bacteroidota"    
## 3dc3273da2f47f5733591d9b58cbaf1f "d__Bacteria" "p__Bacteroidota"    
## 3b299fbe01dee5fffafb39ed55fd3e78 "d__Bacteria" "p__Bacteroidota"    
## e59af00273a37b1b1c4708b5971deda2 "d__Bacteria" "p__Synergistota"    
## 82e5115a8f1ead595da52c79fe587645 "d__Bacteria" "p__Firmicutes"      
## 105344b0b04df4d06036c8c91f439255 "d__Bacteria" "p__Firmicutes"      
## 6ee9e6771e438dcdfbbbb4094c40edb8 "d__Bacteria" "p__Firmicutes"      
## 02bf15068e3e7108f4912f7b2e144b1c "d__Bacteria" "p__Firmicutes"      
## 57339de5c8b87df1e1c2245bb19d8b51 "d__Bacteria" "p__Firmicutes"      
## ac07428611cda65054aec775f50cb486 "d__Bacteria" "p__Firmicutes"      
## 0991016c9dae25e6dabbd6652d71239a "d__Bacteria" "p__Firmicutes"      
## af3909edeceab197f554e240fe97af83 "d__Bacteria" "p__Bacteroidota"    
## f1bf04ec7f805bc268eaddc0c1a1b79a "d__Bacteria" "p__Bacteroidota"    
## cb17ef306146fe98440e15e8181ab0a4 "d__Bacteria" "p__Bacteroidota"    
## 1c441acb05af7d8fee64aaf52bf3d223 "d__Bacteria" "p__Firmicutes"      
## 27f4cbb6d84d5883800387166ffb6906 "d__Bacteria" "p__Firmicutes"      
## f64256b1969b2f5459213b19bef94eaa "d__Bacteria" "p__Firmicutes"      
## 17e293d9412552f2483c73910ff7dd8d "d__Bacteria" "p__Firmicutes"      
## 4400354e1979621239b87e92c5a38a29 "d__Bacteria" "p__Firmicutes"      
## 8cf427b8122d1849f3b951790cc3dcd3 "d__Bacteria" "p__Firmicutes"      
## 747a284c95e1af0ec02d6582f6565774 "d__Bacteria" "p__Firmicutes"      
## 6630583dd78eb092ece5e17515eb301d "d__Bacteria" "p__Firmicutes"      
## 50ef4b6ac91b2cddcb87cfe5a3b622f8 "d__Bacteria" "p__Firmicutes"      
## 2f7a13c13416fff5be074cf32995ec44 "d__Bacteria" "p__Firmicutes"      
## cdc5f1f0fcd4c3dd584022d5e28a86c6 "d__Bacteria" "p__Firmicutes"      
## 186c9fa38e1428e565a3e55fc8483a88 "d__Bacteria" "p__Firmicutes"      
## 2bfe6dc4610795667fcc404fad0e6189 "d__Bacteria" "p__Firmicutes"      
## 3963890db4dddea7f233911bc8a6a079 "d__Bacteria" "p__Firmicutes"      
## ec55f07c37e1dd699e2ca82a3dd57489 "d__Bacteria" "p__Firmicutes"      
## 1cf8fc19ce34066f96ad6db0dd004b5a "d__Bacteria" "p__Firmicutes"      
## 9af0eb9868f7dac5bfdcd08fd74d293a "d__Bacteria" "p__Firmicutes"      
## 020599fbbb43f8c01624f2976599bdcc "d__Bacteria" "p__Firmicutes"      
## 98b6e2d543691237446ced7c52325a2c "d__Bacteria" "p__Firmicutes"      
## 37132fca7c4b6ce7b4c34abfa0261664 "d__Bacteria" "p__Firmicutes"      
## c96ab9fa28e0a9a26ee51ac80bbfafde "d__Bacteria" "p__Firmicutes"      
## 1ef02b3a08a0fbd22e0d8fe08ab9b0d2 "d__Bacteria" "p__Firmicutes"      
## 6fbf9c0e088b6bab28b6b042b495a56d "d__Bacteria" "p__Firmicutes"      
## 7bbd42619a8874e3ad59ada1714da878 "d__Bacteria" "p__Firmicutes"      
## b08743e2e152c9e6c117cf375bb04ac1 "d__Bacteria" "p__Firmicutes"      
## 3f7fd596312c975960542b051cd5cf43 "d__Bacteria" "p__Firmicutes"      
## 5581c82e6046e3be5e70a4380bf56b44 "d__Bacteria" "p__Firmicutes"      
## 8d4dca0e1ab2475b5c5d37bcb46aee5c "d__Bacteria" "p__Firmicutes"      
## 310023723aab4696cdb707dc6bb4d94a "d__Bacteria" "p__Firmicutes"      
## 015ce998953e085dd4d5011f7edd05e2 "d__Bacteria" "p__Firmicutes"      
## 257a3feb1cfa43f2c55429863b0f16b3 "d__Bacteria" "p__Firmicutes"      
## 93bd5b75bbf8507566602238d02b8fcf "d__Bacteria" "p__Firmicutes"      
## 7a4b03d408be56f063ae11449f3044a0 "d__Bacteria" "p__Firmicutes"      
##                                  Class              
## 8f3c8cb8544640becafb5777eb7a5858 "c__Clostridia"    
## 358362b29cb4e52ca01e236430b08043 "c__Incertae_Sedis"
## 80ab652869a99e7aa6aa94f6963a494a "c__Clostridia"    
## 9df251784dde31e05f02b2ee1029d71c "c__Clostridia"    
## 1d122958ee96131b014bbe0fecba81b1 "c__Actinobacteria"
## ed8b4ed1581ca52b0788f8039536313d "c__Bacteroidia"   
## 3dc3273da2f47f5733591d9b58cbaf1f "c__Bacteroidia"   
## 3b299fbe01dee5fffafb39ed55fd3e78 "c__Bacteroidia"   
## e59af00273a37b1b1c4708b5971deda2 "c__Synergistia"   
## 82e5115a8f1ead595da52c79fe587645 "c__Bacilli"       
## 105344b0b04df4d06036c8c91f439255 "c__Clostridia"    
## 6ee9e6771e438dcdfbbbb4094c40edb8 "c__Clostridia"    
## 02bf15068e3e7108f4912f7b2e144b1c "c__Clostridia"    
## 57339de5c8b87df1e1c2245bb19d8b51 "c__Clostridia"    
## ac07428611cda65054aec775f50cb486 "c__Clostridia"    
## 0991016c9dae25e6dabbd6652d71239a "c__Clostridia"    
## af3909edeceab197f554e240fe97af83 "c__Bacteroidia"   
## f1bf04ec7f805bc268eaddc0c1a1b79a "c__Bacteroidia"   
## cb17ef306146fe98440e15e8181ab0a4 "c__Bacteroidia"   
## 1c441acb05af7d8fee64aaf52bf3d223 "c__Clostridia"    
## 27f4cbb6d84d5883800387166ffb6906 "c__Clostridia"    
## f64256b1969b2f5459213b19bef94eaa "c__Bacilli"       
## 17e293d9412552f2483c73910ff7dd8d "c__Clostridia"    
## 4400354e1979621239b87e92c5a38a29 "c__Clostridia"    
## 8cf427b8122d1849f3b951790cc3dcd3 "c__Clostridia"    
## 747a284c95e1af0ec02d6582f6565774 "c__Clostridia"    
## 6630583dd78eb092ece5e17515eb301d "c__Clostridia"    
## 50ef4b6ac91b2cddcb87cfe5a3b622f8 "c__Clostridia"    
## 2f7a13c13416fff5be074cf32995ec44 "c__Clostridia"    
## cdc5f1f0fcd4c3dd584022d5e28a86c6 "c__Clostridia"    
## 186c9fa38e1428e565a3e55fc8483a88 "c__Clostridia"    
## 2bfe6dc4610795667fcc404fad0e6189 "c__Clostridia"    
## 3963890db4dddea7f233911bc8a6a079 "c__Clostridia"    
## ec55f07c37e1dd699e2ca82a3dd57489 "c__Clostridia"    
## 1cf8fc19ce34066f96ad6db0dd004b5a "c__Clostridia"    
## 9af0eb9868f7dac5bfdcd08fd74d293a "c__Clostridia"    
## 020599fbbb43f8c01624f2976599bdcc "c__Clostridia"    
## 98b6e2d543691237446ced7c52325a2c "c__Clostridia"    
## 37132fca7c4b6ce7b4c34abfa0261664 "c__Clostridia"    
## c96ab9fa28e0a9a26ee51ac80bbfafde "c__Clostridia"    
## 1ef02b3a08a0fbd22e0d8fe08ab9b0d2 "c__Clostridia"    
## 6fbf9c0e088b6bab28b6b042b495a56d "c__Clostridia"    
## 7bbd42619a8874e3ad59ada1714da878 "c__Clostridia"    
## b08743e2e152c9e6c117cf375bb04ac1 "c__Clostridia"    
## 3f7fd596312c975960542b051cd5cf43 "c__Clostridia"    
## 5581c82e6046e3be5e70a4380bf56b44 "c__Clostridia"    
## 8d4dca0e1ab2475b5c5d37bcb46aee5c "c__Clostridia"    
## 310023723aab4696cdb707dc6bb4d94a "c__Clostridia"    
## 015ce998953e085dd4d5011f7edd05e2 "c__Clostridia"    
## 257a3feb1cfa43f2c55429863b0f16b3 "c__Clostridia"    
## 93bd5b75bbf8507566602238d02b8fcf "c__Clostridia"    
## 7a4b03d408be56f063ae11449f3044a0 "c__Clostridia"    
##                                  Order                                   
## 8f3c8cb8544640becafb5777eb7a5858 "o__Peptostreptococcales-Tissierellales"
## 358362b29cb4e52ca01e236430b08043 "o__DTU014"                             
## 80ab652869a99e7aa6aa94f6963a494a "o__Christensenellales"                 
## 9df251784dde31e05f02b2ee1029d71c "o__Lachnospirales"                     
## 1d122958ee96131b014bbe0fecba81b1 "o__Actinomycetales"                    
## ed8b4ed1581ca52b0788f8039536313d "o__Bacteroidales"                      
## 3dc3273da2f47f5733591d9b58cbaf1f "o__Bacteroidales"                      
## 3b299fbe01dee5fffafb39ed55fd3e78 "o__Bacteroidales"                      
## e59af00273a37b1b1c4708b5971deda2 "o__Synergistales"                      
## 82e5115a8f1ead595da52c79fe587645 "o__Erysipelotrichales"                 
## 105344b0b04df4d06036c8c91f439255 "o__Lachnospirales"                     
## 6ee9e6771e438dcdfbbbb4094c40edb8 "o__Lachnospirales"                     
## 02bf15068e3e7108f4912f7b2e144b1c "o__Oscillospirales"                    
## 57339de5c8b87df1e1c2245bb19d8b51 "o__Christensenellales"                 
## ac07428611cda65054aec775f50cb486 "o__Christensenellales"                 
## 0991016c9dae25e6dabbd6652d71239a "o__Lachnospirales"                     
## af3909edeceab197f554e240fe97af83 "o__Bacteroidales"                      
## f1bf04ec7f805bc268eaddc0c1a1b79a "o__Bacteroidales"                      
## cb17ef306146fe98440e15e8181ab0a4 "o__Bacteroidales"                      
## 1c441acb05af7d8fee64aaf52bf3d223 "o__Peptostreptococcales-Tissierellales"
## 27f4cbb6d84d5883800387166ffb6906 "o__Peptostreptococcales-Tissierellales"
## f64256b1969b2f5459213b19bef94eaa "o__RF39"                               
## 17e293d9412552f2483c73910ff7dd8d "o__Oscillospirales"                    
## 4400354e1979621239b87e92c5a38a29 "o__Lachnospirales"                     
## 8cf427b8122d1849f3b951790cc3dcd3 "o__Clostridia_UCG-014"                 
## 747a284c95e1af0ec02d6582f6565774 "o__Clostridia_UCG-014"                 
## 6630583dd78eb092ece5e17515eb301d "o__Lachnospirales"                     
## 50ef4b6ac91b2cddcb87cfe5a3b622f8 "o__Lachnospirales"                     
## 2f7a13c13416fff5be074cf32995ec44 "o__Lachnospirales"                     
## cdc5f1f0fcd4c3dd584022d5e28a86c6 "o__Oscillospirales"                    
## 186c9fa38e1428e565a3e55fc8483a88 "o__Oscillospirales"                    
## 2bfe6dc4610795667fcc404fad0e6189 "o__Oscillospirales"                    
## 3963890db4dddea7f233911bc8a6a079 "o__Oscillospirales"                    
## ec55f07c37e1dd699e2ca82a3dd57489 "o__Oscillospirales"                    
## 1cf8fc19ce34066f96ad6db0dd004b5a "o__Oscillospirales"                    
## 9af0eb9868f7dac5bfdcd08fd74d293a "o__Oscillospirales"                    
## 020599fbbb43f8c01624f2976599bdcc "o__Oscillospirales"                    
## 98b6e2d543691237446ced7c52325a2c "o__Oscillospirales"                    
## 37132fca7c4b6ce7b4c34abfa0261664 "o__Oscillospirales"                    
## c96ab9fa28e0a9a26ee51ac80bbfafde "o__Christensenellales"                 
## 1ef02b3a08a0fbd22e0d8fe08ab9b0d2 "o__Christensenellales"                 
## 6fbf9c0e088b6bab28b6b042b495a56d "o__Peptococcales"                      
## 7bbd42619a8874e3ad59ada1714da878 "o__Christensenellales"                 
## b08743e2e152c9e6c117cf375bb04ac1 "o__Oscillospirales"                    
## 3f7fd596312c975960542b051cd5cf43 "o__Oscillospirales"                    
## 5581c82e6046e3be5e70a4380bf56b44 "o__Oscillospirales"                    
## 8d4dca0e1ab2475b5c5d37bcb46aee5c "o__Oscillospirales"                    
## 310023723aab4696cdb707dc6bb4d94a "o__Oscillospirales"                    
## 015ce998953e085dd4d5011f7edd05e2 "o__Oscillospirales"                    
## 257a3feb1cfa43f2c55429863b0f16b3 "o__Oscillospirales"                    
## 93bd5b75bbf8507566602238d02b8fcf "o__Oscillospirales"                    
## 7a4b03d408be56f063ae11449f3044a0 "o__Lachnospirales"                     
##                                  Family                                    
## 8f3c8cb8544640becafb5777eb7a5858 "f__Family_XI"                            
## 358362b29cb4e52ca01e236430b08043 "f__DTU014"                               
## 80ab652869a99e7aa6aa94f6963a494a "f__Christensenellaceae"                  
## 9df251784dde31e05f02b2ee1029d71c "f__Lachnospiraceae"                      
## 1d122958ee96131b014bbe0fecba81b1 "f__Actinomycetaceae"                     
## ed8b4ed1581ca52b0788f8039536313d "f__Rikenellaceae"                        
## 3dc3273da2f47f5733591d9b58cbaf1f "f__Tannerellaceae"                       
## 3b299fbe01dee5fffafb39ed55fd3e78 "f__Barnesiellaceae"                      
## e59af00273a37b1b1c4708b5971deda2 "f__Synergistaceae"                       
## 82e5115a8f1ead595da52c79fe587645 "f__Erysipelotrichaceae"                  
## 105344b0b04df4d06036c8c91f439255 "f__Lachnospiraceae"                      
## 6ee9e6771e438dcdfbbbb4094c40edb8 "f__Lachnospiraceae"                      
## 02bf15068e3e7108f4912f7b2e144b1c "f__Oscillospiraceae"                     
## 57339de5c8b87df1e1c2245bb19d8b51 "f__Christensenellaceae"                  
## ac07428611cda65054aec775f50cb486 "f__Christensenellaceae"                  
## 0991016c9dae25e6dabbd6652d71239a "f__Lachnospiraceae"                      
## af3909edeceab197f554e240fe97af83 "f__Rikenellaceae"                        
## f1bf04ec7f805bc268eaddc0c1a1b79a "f__Bacteroidaceae"                       
## cb17ef306146fe98440e15e8181ab0a4 "f__Tannerellaceae"                       
## 1c441acb05af7d8fee64aaf52bf3d223 "f__Anaerovoracaceae"                     
## 27f4cbb6d84d5883800387166ffb6906 "f__Anaerovoracaceae"                     
## f64256b1969b2f5459213b19bef94eaa "f__RF39"                                 
## 17e293d9412552f2483c73910ff7dd8d "f__Butyricicoccaceae"                    
## 4400354e1979621239b87e92c5a38a29 "f__Lachnospiraceae"                      
## 8cf427b8122d1849f3b951790cc3dcd3 "f__Clostridia_UCG-014"                   
## 747a284c95e1af0ec02d6582f6565774 "f__Clostridia_UCG-014"                   
## 6630583dd78eb092ece5e17515eb301d "f__Lachnospiraceae"                      
## 50ef4b6ac91b2cddcb87cfe5a3b622f8 "f__Lachnospiraceae"                      
## 2f7a13c13416fff5be074cf32995ec44 "f__Lachnospiraceae"                      
## cdc5f1f0fcd4c3dd584022d5e28a86c6 "f__Oscillospiraceae"                     
## 186c9fa38e1428e565a3e55fc8483a88 "f__Oscillospiraceae"                     
## 2bfe6dc4610795667fcc404fad0e6189 "f__Ruminococcaceae"                      
## 3963890db4dddea7f233911bc8a6a079 "f__Ruminococcaceae"                      
## ec55f07c37e1dd699e2ca82a3dd57489 "f__Ruminococcaceae"                      
## 1cf8fc19ce34066f96ad6db0dd004b5a "f__Ruminococcaceae"                      
## 9af0eb9868f7dac5bfdcd08fd74d293a "f__Ruminococcaceae"                      
## 020599fbbb43f8c01624f2976599bdcc "f__Ruminococcaceae"                      
## 98b6e2d543691237446ced7c52325a2c "f__Ruminococcaceae"                      
## 37132fca7c4b6ce7b4c34abfa0261664 "f__Oscillospiraceae"                     
## c96ab9fa28e0a9a26ee51ac80bbfafde "f__Christensenellaceae"                  
## 1ef02b3a08a0fbd22e0d8fe08ab9b0d2 "f__Christensenellaceae"                  
## 6fbf9c0e088b6bab28b6b042b495a56d "f__Peptococcaceae"                       
## 7bbd42619a8874e3ad59ada1714da878 "f__Christensenellaceae"                  
## b08743e2e152c9e6c117cf375bb04ac1 "o__Oscillospirales_NA"                   
## 3f7fd596312c975960542b051cd5cf43 "f__Ruminococcaceae"                      
## 5581c82e6046e3be5e70a4380bf56b44 "f__Ruminococcaceae"                      
## 8d4dca0e1ab2475b5c5d37bcb46aee5c "f__Ruminococcaceae"                      
## 310023723aab4696cdb707dc6bb4d94a "f__Ruminococcaceae"                      
## 015ce998953e085dd4d5011f7edd05e2 "f__Ruminococcaceae"                      
## 257a3feb1cfa43f2c55429863b0f16b3 "f__[Eubacterium]_coprostanoligenes_group"
## 93bd5b75bbf8507566602238d02b8fcf "f__[Eubacterium]_coprostanoligenes_group"
## 7a4b03d408be56f063ae11449f3044a0 "f__Lachnospiraceae"                      
##                                  Genus                                     
## 8f3c8cb8544640becafb5777eb7a5858 "g__Parvimonas"                           
## 358362b29cb4e52ca01e236430b08043 "g__DTU014"                               
## 80ab652869a99e7aa6aa94f6963a494a "g__Christensenellaceae_R-7_group"        
## 9df251784dde31e05f02b2ee1029d71c "g__Fusicatenibacter"                     
## 1d122958ee96131b014bbe0fecba81b1 "g__Actinomyces"                          
## ed8b4ed1581ca52b0788f8039536313d "g__Alistipes"                            
## 3dc3273da2f47f5733591d9b58cbaf1f "g__Parabacteroides"                      
## 3b299fbe01dee5fffafb39ed55fd3e78 "g__uncultured"                           
## e59af00273a37b1b1c4708b5971deda2 "g__Cloacibacillus"                       
## 82e5115a8f1ead595da52c79fe587645 "g__Solobacterium"                        
## 105344b0b04df4d06036c8c91f439255 "g__Blautia"                              
## 6ee9e6771e438dcdfbbbb4094c40edb8 "g__Lachnospiraceae_UCG-003"              
## 02bf15068e3e7108f4912f7b2e144b1c "g__UCG-002"                              
## 57339de5c8b87df1e1c2245bb19d8b51 "g__Christensenellaceae_R-7_group"        
## ac07428611cda65054aec775f50cb486 "g__Christensenellaceae_R-7_group"        
## 0991016c9dae25e6dabbd6652d71239a "g__Fusicatenibacter"                     
## af3909edeceab197f554e240fe97af83 "g__Alistipes"                            
## f1bf04ec7f805bc268eaddc0c1a1b79a "g__Bacteroides"                          
## cb17ef306146fe98440e15e8181ab0a4 "g__Parabacteroides"                      
## 1c441acb05af7d8fee64aaf52bf3d223 "g__[Eubacterium]_nodatum_group"          
## 27f4cbb6d84d5883800387166ffb6906 "f__Anaerovoracaceae_NA"                  
## f64256b1969b2f5459213b19bef94eaa "g__RF39"                                 
## 17e293d9412552f2483c73910ff7dd8d "g__Butyricicoccus"                       
## 4400354e1979621239b87e92c5a38a29 "g__Roseburia"                            
## 8cf427b8122d1849f3b951790cc3dcd3 "g__Clostridia_UCG-014"                   
## 747a284c95e1af0ec02d6582f6565774 "g__Clostridia_UCG-014"                   
## 6630583dd78eb092ece5e17515eb301d "f__Lachnospiraceae_NA"                   
## 50ef4b6ac91b2cddcb87cfe5a3b622f8 "g__Sellimonas"                           
## 2f7a13c13416fff5be074cf32995ec44 "f__Lachnospiraceae_NA"                   
## cdc5f1f0fcd4c3dd584022d5e28a86c6 "g__UCG-002"                              
## 186c9fa38e1428e565a3e55fc8483a88 "g__uncultured"                           
## 2bfe6dc4610795667fcc404fad0e6189 "g__Incertae_Sedis"                       
## 3963890db4dddea7f233911bc8a6a079 "g__Caproiciproducens"                    
## ec55f07c37e1dd699e2ca82a3dd57489 "g__uncultured"                           
## 1cf8fc19ce34066f96ad6db0dd004b5a "g__DTU089"                               
## 9af0eb9868f7dac5bfdcd08fd74d293a "g__DTU089"                               
## 020599fbbb43f8c01624f2976599bdcc "g__CAG-352"                              
## 98b6e2d543691237446ced7c52325a2c "g__Subdoligranulum"                      
## 37132fca7c4b6ce7b4c34abfa0261664 "g__Papillibacter"                        
## c96ab9fa28e0a9a26ee51ac80bbfafde "g__Christensenellaceae_R-7_group"        
## 1ef02b3a08a0fbd22e0d8fe08ab9b0d2 "g__Christensenellaceae_R-7_group"        
## 6fbf9c0e088b6bab28b6b042b495a56d "g__uncultured"                           
## 7bbd42619a8874e3ad59ada1714da878 "g__uncultured"                           
## b08743e2e152c9e6c117cf375bb04ac1 "o__Oscillospirales_NA"                   
## 3f7fd596312c975960542b051cd5cf43 "g__Candidatus_Soleaferrea"               
## 5581c82e6046e3be5e70a4380bf56b44 "f__Ruminococcaceae_NA"                   
## 8d4dca0e1ab2475b5c5d37bcb46aee5c "g__Paludicola"                           
## 310023723aab4696cdb707dc6bb4d94a "g__Fournierella"                         
## 015ce998953e085dd4d5011f7edd05e2 "g__uncultured"                           
## 257a3feb1cfa43f2c55429863b0f16b3 "g__[Eubacterium]_coprostanoligenes_group"
## 93bd5b75bbf8507566602238d02b8fcf "g__[Eubacterium]_coprostanoligenes_group"
## 7a4b03d408be56f063ae11449f3044a0 "f__Lachnospiraceae_NA"                   
##                                  Species                              
## 8f3c8cb8544640becafb5777eb7a5858 "g__Parvimonas_NA"                   
## 358362b29cb4e52ca01e236430b08043 "s__unidentified"                    
## 80ab652869a99e7aa6aa94f6963a494a "g__Christensenellaceae_R-7_group_NA"
## 9df251784dde31e05f02b2ee1029d71c "g__Fusicatenibacter_NA"             
## 1d122958ee96131b014bbe0fecba81b1 "s__Schaalia_cardiffensis"           
## ed8b4ed1581ca52b0788f8039536313d "s__Alistipes_shahii"                
## 3dc3273da2f47f5733591d9b58cbaf1f "s__Parabacteroides_merdae"          
## 3b299fbe01dee5fffafb39ed55fd3e78 "g__uncultured_NA"                   
## e59af00273a37b1b1c4708b5971deda2 "s__Cloacibacillus_porcorum"         
## 82e5115a8f1ead595da52c79fe587645 "s__uncultured_bacterium"            
## 105344b0b04df4d06036c8c91f439255 "g__Blautia_NA"                      
## 6ee9e6771e438dcdfbbbb4094c40edb8 "s__uncultured_bacterium"            
## 02bf15068e3e7108f4912f7b2e144b1c "s__uncultured_organism"             
## 57339de5c8b87df1e1c2245bb19d8b51 "g__Christensenellaceae_R-7_group_NA"
## ac07428611cda65054aec775f50cb486 "g__Christensenellaceae_R-7_group_NA"
## 0991016c9dae25e6dabbd6652d71239a "g__Fusicatenibacter_NA"             
## af3909edeceab197f554e240fe97af83 "s__Alistipes_inops"                 
## f1bf04ec7f805bc268eaddc0c1a1b79a "g__Bacteroides_NA"                  
## cb17ef306146fe98440e15e8181ab0a4 "s__Parabacteroides_johnsonii"       
## 1c441acb05af7d8fee64aaf52bf3d223 "s__uncultured_bacterium"            
## 27f4cbb6d84d5883800387166ffb6906 "f__Anaerovoracaceae_NA"             
## f64256b1969b2f5459213b19bef94eaa "g__RF39_NA"                         
## 17e293d9412552f2483c73910ff7dd8d "g__Butyricicoccus_NA"               
## 4400354e1979621239b87e92c5a38a29 "g__Roseburia_NA"                    
## 8cf427b8122d1849f3b951790cc3dcd3 "g__Clostridia_UCG-014_NA"           
## 747a284c95e1af0ec02d6582f6565774 "g__Clostridia_UCG-014_NA"           
## 6630583dd78eb092ece5e17515eb301d "f__Lachnospiraceae_NA"              
## 50ef4b6ac91b2cddcb87cfe5a3b622f8 "s__Lachnoclostridium_phocaeense"    
## 2f7a13c13416fff5be074cf32995ec44 "f__Lachnospiraceae_NA"              
## cdc5f1f0fcd4c3dd584022d5e28a86c6 "s__uncultured_rumen"                
## 186c9fa38e1428e565a3e55fc8483a88 "g__uncultured_NA"                   
## 2bfe6dc4610795667fcc404fad0e6189 "s__uncultured_bacterium"            
## 3963890db4dddea7f233911bc8a6a079 "g__Caproiciproducens_NA"            
## ec55f07c37e1dd699e2ca82a3dd57489 "g__uncultured_NA"                   
## 1cf8fc19ce34066f96ad6db0dd004b5a "g__DTU089_NA"                       
## 9af0eb9868f7dac5bfdcd08fd74d293a "g__DTU089_NA"                       
## 020599fbbb43f8c01624f2976599bdcc "s__uncultured_bacterium"            
## 98b6e2d543691237446ced7c52325a2c "g__Subdoligranulum_NA"              
## 37132fca7c4b6ce7b4c34abfa0261664 "s__uncultured_bacterium"            
## c96ab9fa28e0a9a26ee51ac80bbfafde "g__Christensenellaceae_R-7_group_NA"
## 1ef02b3a08a0fbd22e0d8fe08ab9b0d2 "g__Christensenellaceae_R-7_group_NA"
## 6fbf9c0e088b6bab28b6b042b495a56d "g__uncultured_NA"                   
## 7bbd42619a8874e3ad59ada1714da878 "s__uncultured_bacterium"            
## b08743e2e152c9e6c117cf375bb04ac1 "o__Oscillospirales_NA"              
## 3f7fd596312c975960542b051cd5cf43 "s__uncultured_bacterium"            
## 5581c82e6046e3be5e70a4380bf56b44 "f__Ruminococcaceae_NA"              
## 8d4dca0e1ab2475b5c5d37bcb46aee5c "s__uncultured_bacterium"            
## 310023723aab4696cdb707dc6bb4d94a "s__uncultured_organism"             
## 015ce998953e085dd4d5011f7edd05e2 "s__human_gut"                       
## 257a3feb1cfa43f2c55429863b0f16b3 "s__gut_metagenome"                  
## 93bd5b75bbf8507566602238d02b8fcf "s__Eubacteriaceae_bacterium"        
## 7a4b03d408be56f063ae11449f3044a0 "f__Lachnospiraceae_NA"

Because LinDA applies strict criteria, we show and retain only the features with qval < 0.1.
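To make the retention rule concrete, the following minimal Python sketch recomputes q-values from p-values and keeps the features with qval < 0.1. It rests on two assumptions that this report does not state: that the q-values are Benjamini-Hochberg adjusted p-values, and that 212 taxa were tested (a value inferred from the ratio between the printed qval and pval of the top-ranked taxon). The p-values are the ten smallest in the genus-level output printed under 10.4.2.1. `bh_adjust`, `PVALS`, `N_TESTS` and `Q_CUTOFF` are hypothetical names used only for illustration; they are not part of LinDA.

```python
# Sketch only: Benjamini-Hochberg adjustment followed by the qval < 0.1 rule.
# ASSUMPTIONS (not stated in the report): BH adjustment, and 212 tested taxa.

N_TESTS = 212   # assumed number of tested taxa (inferred, not reported)
Q_CUTOFF = 0.1  # retention threshold stated in the text

# Ten smallest p-values copied from the genus-level output (section 10.4.2.1).
PVALS = {
    "8f3c8cb8544640becafb5777eb7a5858": 2.458894e-05,
    "2f48ab3f1c064ad0cc3d56842b592031": 2.662394e-03,
    "1d2b00ff8b7477d2b6c2b8043f7c9d31": 1.555353e-03,
    "114d0aefa07e7ab7836011f44281b737": 1.838003e-03,
    "d16a4faef202f2f76497bdd5a6f454b7": 2.377946e-03,
    "dd9711b1499af66948ee3697968d698c": 2.408120e-03,
    "bd636d8da2dca9bb0b1aa1da7e88890d": 3.351394e-03,
    "78c2b8e06a58a133919d9e5fd0cd2da1": 2.507560e-03,
    "93b58b0ba0d326e9c8d1a81f8672c16a": 3.924189e-03,
    "9df251784dde31e05f02b2ee1029d71c": 7.811807e-05,
}


def bh_adjust(pvals, m=None):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    m is the total number of tests; it defaults to len(pvals). Passing a
    larger m with only the smallest p-values is exact for those entries
    only if no omitted p-value would lower the running minimum.
    """
    m = len(pvals) if m is None else m
    order = sorted(range(len(pvals)), key=lambda i: pvals[i])
    adjusted = [0.0] * len(pvals)
    running_min = 1.0
    # Walk from the largest to the smallest p-value, keeping the running
    # minimum of p * m / rank so that adjusted values stay monotone.
    for rank in range(len(order), 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


qvals = dict(zip(PVALS, bh_adjust(list(PVALS.values()), m=N_TESTS)))
retained = [taxon for taxon, q in qvals.items() if q < Q_CUTOFF]

for taxon in retained:
    print(f"{taxon}  qval={qvals[taxon]:.6f}")
```

Under these assumptions, the recomputed values agree to about six decimal places with the qval column printed under 10.4.2.1 for the taxa whose q-value is shown there, and all ten taxa fall below the 0.1 cutoff.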

##                           taxon_id     logFC        se      qval direction
## 1 8f3c8cb8544640becafb5777eb7a5858 -4.496979 1.0140375 0.0369338       Low
## 2 358362b29cb4e52ca01e236430b08043 -1.349932 0.3065764 0.0369338       Low
## 3 80ab652869a99e7aa6aa94f6963a494a -1.349932 0.3065764 0.0369338       Low
## 4 9df251784dde31e05f02b2ee1029d71c  3.995333 0.8513157 0.0369338       Sev
##   qval_txt orientation     target_level alpha_grp               Genus
## 1        *          -1 tox (sev vs low)       0.6       g__Parvimonas
## 2        *          -1 tox (sev vs low)       0.6                <NA>
## 3        *          -1 tox (sev vs low)       0.6                <NA>
## 4        *           1 tox (sev vs low)       0.6 g__Fusicatenibacter
##                              label
## 1                    g__Parvimonas
## 2 358362b29cb4e52ca01e236430b08043
## 3 80ab652869a99e7aa6aa94f6963a494a
## 4              g__Fusicatenibacter
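The derived columns in the table above (direction, orientation, label) appear to follow directly from logFC and Genus. The minimal Python sketch below illustrates one way they could be computed, assuming that a positive logFC in the "sev vs low" contrast marks enrichment in the severe group and that the label falls back to the taxon ID when Genus is missing. `LindaRow`, `annotate` and `retain` are hypothetical names, and the last input row is a made-up example added only to show a feature being dropped; the other four rows are copied from the table.

```python
from dataclasses import dataclass
from typing import Optional

Q_CUTOFF = 0.1  # retention threshold stated in the text


@dataclass
class LindaRow:
    taxon_id: str
    log_fc: float
    qval: float
    genus: Optional[str] = None  # None stands in for R's <NA>


def annotate(row: LindaRow) -> dict:
    """Derive the display columns from logFC and Genus (assumed rules)."""
    enriched_in_severe = row.log_fc > 0
    return {
        "taxon_id": row.taxon_id,
        "direction": "Sev" if enriched_in_severe else "Low",
        "orientation": 1 if enriched_in_severe else -1,
        # Fall back to the taxon ID when no genus name is available.
        "label": row.genus if row.genus is not None else row.taxon_id,
    }


def retain(rows, cutoff: float = Q_CUTOFF) -> list:
    """Keep only rows with qval below the cutoff, then annotate them."""
    return [annotate(r) for r in rows if r.qval < cutoff]


rows = [
    LindaRow("8f3c8cb8544640becafb5777eb7a5858", -4.496979, 0.0369338, "g__Parvimonas"),
    LindaRow("358362b29cb4e52ca01e236430b08043", -1.349932, 0.0369338),
    LindaRow("80ab652869a99e7aa6aa94f6963a494a", -1.349932, 0.0369338),
    LindaRow("9df251784dde31e05f02b2ee1029d71c", 3.995333, 0.0369338, "g__Fusicatenibacter"),
    LindaRow("hypothetical_taxon", 0.8, 0.35),  # made-up row, fails the cutoff
]

kept = retain(rows)
for item in kept:
    print(item["label"], item["direction"], item["orientation"])
```

Keeping the filter and the annotation in separate functions means the cutoff can be changed without touching the labelling rules.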

10.4.2 Prevalence filtering

10.4.2.1 Genus

## Pseudo-count approach is used.
##                            taxon_id      logFC        se         pval
## 1  8f3c8cb8544640becafb5777eb7a5858 -5.0425457 0.9997091 2.458894e-05
## 2  2f48ab3f1c064ad0cc3d56842b592031 -4.2707063 1.2954248 2.662394e-03
## 3  1d2b00ff8b7477d2b6c2b8043f7c9d31 -3.5378299 1.0093133 1.555353e-03
## 4  11902be998c8092e98cbe98af2f648af -3.4433137 1.4641530 2.595165e-02
## 5  114d0aefa07e7ab7836011f44281b737 -3.1942050 0.9283221 1.838003e-03
## 6  6343c15ef6a0b28bb8d019ebbcd0a55a -3.0537582 1.4286737 4.143148e-02
## 7  d16a4faef202f2f76497bdd5a6f454b7 -2.7360674 0.8189631 2.377946e-03
## 8  dd9711b1499af66948ee3697968d698c -2.6275397 0.7876375 2.408120e-03
## 9  bd636d8da2dca9bb0b1aa1da7e88890d -2.5717471 0.8020960 3.351394e-03
## 10 cc465b18f37cb7e609cff5ba5ed2bffe -2.4444538 1.1968956 5.063823e-02
## 11 fab1c84c9592bdd07aab0fed81cac039 -2.4174912 0.8210726 6.444136e-03
## 12 b0553a7cf3e72573824b0dacf2747ae5 -2.2029776 0.9421536 2.674508e-02
## 13 df796b258d978ffc2ff31467cda2f996 -2.1201946 1.2613875 1.039172e-01
## 14 478739167292e4f93f1eb7aceb2e3e81 -2.0668687 1.4905123 1.764805e-01
## 15 78c2b8e06a58a133919d9e5fd0cd2da1 -2.0622355 0.6211206 2.507560e-03
## 16 8f6cab269b0e9c75fb86b4e1f1046fbd -2.0442092 0.7664573 1.257079e-02
## 17 134d27c5f6a976ed0421b8e0581d4696 -2.0272602 1.1137261 7.942666e-02
## 18 089a1f1f72e5358297250366530029ed -1.9887385 1.0486525 6.825913e-02
## 19 1c441acb05af7d8fee64aaf52bf3d223 -1.9851335 0.8852837 3.303957e-02
## 20 8ca6b78b226fb18a090b1df68fb9edd9 -1.9724367 1.2576288 1.280263e-01
## 21 4df8473b0be4ef1e59f0b0709c586919 -1.9288835 1.1437874 1.028316e-01
## 22 f047aa1950d5df8c5f644fc3e0816c7a -1.8334816 0.8227024 3.404550e-02
## 23 8e5b84dcbc738d004995e37ef8fb41c0 -1.8074723 1.3098679 1.785409e-01
## 24 b52cd8c186d511e8e4bad72d8bfbf0ed -1.7994300 0.8831979 5.116065e-02
## 25 93b58b0ba0d326e9c8d1a81f8672c16a -1.7704855 0.5631732 3.924189e-03
## 26 310023723aab4696cdb707dc6bb4d94a -1.7587118 0.9968910 8.860782e-02
## 27 e15b6ef1cd643dff3f0649b7baba06e8 -1.7355239 1.1163694 1.312696e-01
## 28 31c7bc067538dcf7916e8b7cbefa44e5 -1.7100218 1.3262709 2.078276e-01
## 29 1c4b27dbe152290cd541bc61cf2e32ff -1.6905130 0.7400302 3.013220e-02
## 30 d0377d9209ed8843c1b85df7bc622a2c -1.6854215 1.3480697 2.215573e-01
## 31 02e1e971506b265659c0f5bf758c24b7 -1.6705025 1.0917879 1.372221e-01
## 32 6dae81dec55dfbe3d76b5f031f625cd7 -1.6579210 0.6646975 1.879734e-02
## 33 3534b4be198dac95882f6e730b647c60 -1.5824800 1.0606952 1.469004e-01
## 34 4d6fe682a4dd9aad8decfab830a193e0 -1.5026061 1.0169023 1.506677e-01
## 35 19ff2a801156e1930f697799d400fa95 -1.4971766 1.2792802 2.517343e-01
## 36 119e51e38e589ed2462a6199598f66da -1.4635241 0.5249683 9.427226e-03
## 37 09ad3f07c79e7bbc78206689dc55492d -1.4476238 1.0825907 1.919249e-01
## 38 f2cbb29998f80ea0d81f9cde98ee136e -1.4437358 1.1487358 2.192084e-01
## 39 7b64ebaf4ce0974210877b3efd3f2406 -1.4097703 1.2714581 2.769573e-01
## 40 0b9d6ae30698fe3144a187eaaafdaef8 -1.3914469 1.2030135 2.571949e-01
## 41 08e91014e3d9c1c1329d013f141cd5bd -1.3813777 1.1416981 2.364209e-01
## 42 9e7125a102db8951720ec539b4e78113 -1.3695076 1.3042695 3.026882e-01
## 43 57bab9df7394498748730d8b1613bf45 -1.2821949 0.5101564 1.799090e-02
## 44 da8b26f82eb70e299518e149ae85f3d9 -1.2620121 1.0270087 2.293661e-01
## 45 7931ef4ba7852a196ef9674b8fc64631 -1.2160249 1.1604286 3.036418e-01
## 46 df3d6113f855e22f2a6d44e60f01baa7 -1.1872410 1.1263427 3.008634e-01
## 47 2215fda718f2798e7e94648af4e01b1c -1.1827621 0.4045982 6.784998e-03
## 48 7a1a7d3362a2f4e3bb2b0f088b9a6b84 -1.1589601 0.6011674 6.407265e-02
## 49 3d48874183df65f00f253cf68234e2ca -1.0817522 1.6824310 5.254746e-01
## 50 ffc36e27c82042664a16bcd4d380b286 -1.0813450 1.7407385 5.394934e-01
## 51 dae883c8aea73654e4481627e671e183 -1.0805421 1.2525887 3.956603e-01
## 52 762c20336116b7b3237d0c9ecadb8759 -1.0588756 0.5177663 5.034802e-02
## 53 15e05255b2aa8ee3524ca61eb207bb18 -1.0415531 0.9084611 2.612909e-01
## 54 41ed97af33c5600c6277ed863c6dbba3 -1.0375216 0.4894613 4.302709e-02
## 55 9b90d16e70c628766310ca5d53bd7e86 -1.0140064 0.6932397 1.546828e-01
## 56 e59af00273a37b1b1c4708b5971deda2 -0.9915057 0.4376135 3.139454e-02
## 57 b61ab877a65a7d6819bd4ac9c60edb91  1.0398881 0.6455833 1.184465e-01
## 58 4a1547af0bbae2aff8f3222d2fba2102  1.0778981 1.3425161 4.287990e-01
## 59 f44c7de06be4c6f9cbe32fa9ffbb30b9  1.0843863 0.9340677 2.554730e-01
## 60 df8f768b0e149cc5c7a6f5cf15cd6fa8  1.0927057 1.0854519 3.227082e-01
## 61 18eb929cf9350e0af65cf863c2786858  1.1745112 0.8693992 1.875295e-01
## 62 4608cf2aa614ad44e2e4c90e5145221e  1.1978138 1.5694424 4.517212e-01
## 63 fb7a1c0b3625f00ee42dbfcbaa001f12  1.3697600 1.7007227 4.273765e-01
## 64 e9398bc0ad9626bc4742c93fd0363bac  1.4467848 1.3477420 2.922170e-01
## 65 4516aa60a483dd8c7bbc57098c45f1a5  1.4495658 1.0769593 1.891064e-01
## 66 3f3a0eaeea9c0690b6ede1b17b4fd8ce  1.5659306 1.5023892 3.061919e-01
## 67 25af29e1b2d121f8aae468d270d75518  1.5948703 1.3240017 2.384477e-01
## 68 b264ac8ff5f9aff74f0b9aa084d9a9f0  1.6077512 1.2479526 2.081824e-01
## 69 707940842caa2afe60491008e04a8173  1.7429777 0.9858580 8.796081e-02
## 70 582fae36f33acd1efdcaf7cacf00ef0a  2.5427157 1.1118125 2.995904e-02
## 71 e655845f5f4ce1633524c0c9a0b15927  2.7093556 1.3233373 5.010882e-02
## 72 9df251784dde31e05f02b2ee1029d71c  4.1136620 0.8902026 7.811807e-05
##           qval direction qval_txt orientation     target_level p_sig q_sig
## 1  0.005212856       Low       **          -1 tox (sev vs low)  TRUE  TRUE
## 2  0.070553444       Low       -†          -1 tox (sev vs low)  TRUE FALSE
## 3  0.070553444       Low       -†          -1 tox (sev vs low)  TRUE FALSE
## 4  0.313810715       Low        †          -1 tox (sev vs low)  TRUE FALSE
## 5  0.070553444       Low       -†          -1 tox (sev vs low)  TRUE FALSE
## 6  0.364869699       Low        †          -1 tox (sev vs low)  TRUE FALSE
## 7  0.070553444       Low       -†          -1 tox (sev vs low)  TRUE FALSE
## 8  0.070553444       Low       -†          -1 tox (sev vs low)  TRUE FALSE
## 9  0.078943954       Low       -†          -1 tox (sev vs low)  TRUE FALSE
## 10 0.374002022       Low                   -1 tox (sev vs low) FALSE FALSE
## 11 0.119868301       Low        †          -1 tox (sev vs low)  TRUE FALSE
## 12 0.313810715       Low        †          -1 tox (sev vs low)  TRUE FALSE
## 13 0.564883266       Low                   -1 tox (sev vs low) FALSE FALSE
## 14 0.688193875       Low                   -1 tox (sev vs low) FALSE FALSE
## 15 0.070553444       Low       -†          -1 tox (sev vs low)  TRUE FALSE
## 16 0.190357742       Low        †          -1 tox (sev vs low)  TRUE FALSE
## 17 0.510256117       Low                   -1 tox (sev vs low) FALSE FALSE
## 18 0.452216725       Low                   -1 tox (sev vs low) FALSE FALSE
## 19 0.313810715       Low        †          -1 tox (sev vs low)  TRUE FALSE
## 20 0.631199479       Low                   -1 tox (sev vs low) FALSE FALSE
## 21 0.564883266       Low                   -1 tox (sev vs low) FALSE FALSE
## 22 0.313810715       Low        †          -1 tox (sev vs low)  TRUE FALSE
## 23 0.688193875       Low                   -1 tox (sev vs low) FALSE FALSE
## 24 0.374002022       Low                   -1 tox (sev vs low) FALSE FALSE
## 25 0.083192799       Low       -†          -1 tox (sev vs low)  TRUE FALSE
## 26 0.521801591       Low                   -1 tox (sev vs low) FALSE FALSE
## 27 0.632480819       Low                   -1 tox (sev vs low) FALSE FALSE
## 28 0.706246917       Low                   -1 tox (sev vs low) FALSE FALSE
## 29 0.313810715       Low        †          -1 tox (sev vs low)  TRUE FALSE
## 30 0.706246917       Low                   -1 tox (sev vs low) FALSE FALSE
## 31 0.646468622       Low                   -1 tox (sev vs low) FALSE FALSE
## 32 0.249064747       Low        †          -1 tox (sev vs low)  TRUE FALSE
## 33 0.650979415       Low                   -1 tox (sev vs low) FALSE FALSE
## 34 0.650979415       Low                   -1 tox (sev vs low) FALSE FALSE
## 35 0.736828627       Low                   -1 tox (sev vs low) FALSE FALSE
## 36 0.153736297       Low        †          -1 tox (sev vs low)  TRUE FALSE
## 37 0.689628285       Low                   -1 tox (sev vs low) FALSE FALSE
## 38 0.706246917       Low                   -1 tox (sev vs low) FALSE FALSE
## 39 0.761799579       Low                   -1 tox (sev vs low) FALSE FALSE
## 40 0.736828627       Low                   -1 tox (sev vs low) FALSE FALSE
## 41 0.715141757       Low                   -1 tox (sev vs low) FALSE FALSE
## 42 0.761799579       Low                   -1 tox (sev vs low) FALSE FALSE
## 43 0.249064747       Low        †          -1 tox (sev vs low)  TRUE FALSE
## 44 0.715082426       Low                   -1 tox (sev vs low) FALSE FALSE
## 45 0.761799579       Low                   -1 tox (sev vs low) FALSE FALSE
## 46 0.761799579       Low                   -1 tox (sev vs low) FALSE FALSE
## 47 0.119868301       Low        †          -1 tox (sev vs low)  TRUE FALSE
## 48 0.438174233       Low                   -1 tox (sev vs low) FALSE FALSE
## 49 0.831347896       Low                   -1 tox (sev vs low) FALSE FALSE
## 50 0.847204527       Low                   -1 tox (sev vs low) FALSE FALSE
## 51 0.772895134       Low                   -1 tox (sev vs low) FALSE FALSE
## 52 0.374002022       Low                   -1 tox (sev vs low) FALSE FALSE
## 53 0.738582142       Low                   -1 tox (sev vs low) FALSE FALSE
## 54 0.364869699       Low        †          -1 tox (sev vs low)  TRUE FALSE
## 55 0.650979415       Low                   -1 tox (sev vs low) FALSE FALSE
## 56 0.313810715       Low        †          -1 tox (sev vs low)  TRUE FALSE
## 57 0.598309770       Sev                    1 tox (sev vs low) FALSE FALSE
## 58 0.786781671       Sev                    1 tox (sev vs low) FALSE FALSE
## 59 0.736828627       Sev                    1 tox (sev vs low) FALSE FALSE
## 60 0.761799579       Sev                    1 tox (sev vs low) FALSE FALSE
## 61 0.689628285       Sev                    1 tox (sev vs low) FALSE FALSE
## 62 0.786781671       Sev                    1 tox (sev vs low) FALSE FALSE
## 63 0.786781671       Sev                    1 tox (sev vs low) FALSE FALSE
## 64 0.761799579       Sev                    1 tox (sev vs low) FALSE FALSE
## 65 0.689628285       Sev                    1 tox (sev vs low) FALSE FALSE
## 66 0.761799579       Sev                    1 tox (sev vs low) FALSE FALSE
## 67 0.715141757       Sev                    1 tox (sev vs low) FALSE FALSE
## 68 0.706246917       Sev                    1 tox (sev vs low) FALSE FALSE
## 69 0.521801591       Sev                    1 tox (sev vs low) FALSE FALSE
## 70 0.313810715       Sev        †           1 tox (sev vs low)  TRUE FALSE
## 71 0.374002022       Sev                    1 tox (sev vs low) FALSE FALSE
## 72 0.008280515       Sev       **           1 tox (sev vs low)  TRUE  TRUE
##              sig alpha_grp                              Genus
## 1        BH<0.05         1                      g__Parvimonas
## 2  p<0.05 (solo)       0.3                   g__Fusobacterium
## 3  p<0.05 (solo)       0.3              g__Peptostreptococcus
## 4  p<0.05 (solo)       0.3     g__[Ruminococcus]_gnavus_group
## 5  p<0.05 (solo)       0.3                   g__Solobacterium
## 6  p<0.05 (solo)       0.3                     g__Barnesiella
## 7  p<0.05 (solo)       0.3              o__Oscillospirales_NA
## 8  p<0.05 (solo)       0.3                   g__Porphyromonas
## 9  p<0.05 (solo)       0.3                      g__uncultured
## 10            ns       0.3                g__Negativibacillus
## 11 p<0.05 (solo)       0.3                g__Izemoplasmatales
## 12 p<0.05 (solo)       0.3        g__Family_XIII_AD3011_group
## 13            ns       0.3                g__Senegalimassilia
## 14            ns       0.3                    g__Butyrivibrio
## 15 p<0.05 (solo)       0.3                      g__Hungatella
## 16 p<0.05 (solo)       0.3             g__Family_XIII_UCG-001
## 17            ns       0.3                   g__Enterorhabdus
## 18            ns       0.3                         g__Gemella
## 19 p<0.05 (solo)       0.3     g__[Eubacterium]_nodatum_group
## 20            ns       0.3                         g__UCG-010
## 21            ns       0.3                         g__UBA1819
## 22 p<0.05 (solo)       0.3                  g__Intestinimonas
## 23            ns       0.3                   g__Butyricimonas
## 24            ns       0.3                   g__Coprobacillus
## 25 p<0.05 (solo)       0.3    g__[Ruminococcus]_torques_group
## 26            ns       0.3                    g__Fournierella
## 27            ns       0.3                         g__UCG-003
## 28            ns       0.3                    g__Anaerostipes
## 29 p<0.05 (solo)       0.3                         g__UCG-009
## 30            ns       0.3               g__Lachnoclostridium
## 31            ns       0.3         g__Lachnospiraceae_UCG-010
## 32 p<0.05 (solo)       0.3        g__Hydrogenoanaerobacterium
## 33            ns       0.3      g__[Eubacterium]_hallii_group
## 34            ns       0.3                 g__Parabacteroides
## 35            ns       0.3 g__[Eubacterium]_ruminantium_group
## 36 p<0.05 (solo)       0.3                     g__Merdibacter
## 37            ns       0.3                   g__Oscillibacter
## 38            ns       0.3     g__Clostridium_sensu_stricto_1
## 39            ns       0.3                      g__Howardella
## 40            ns       0.3                       g__Bilophila
## 41            ns       0.3                      g__uncultured
## 42            ns       0.3                   g__Streptococcus
## 43 p<0.05 (solo)       0.3 g__[Eubacterium]_fissicatena_group
## 44            ns       0.3                  g__Butyricicoccus
## 45            ns       0.3                      g__uncultured
## 46            ns       0.3                      g__Romboutsia
## 47 p<0.05 (solo)       0.3                     g__Eubacterium
## 48            ns       0.3    g__[Clostridium]_innocuum_group
## 49            ns       0.3                  g__Paraprevotella
## 50            ns       0.3            g__Escherichia-Shigella
## 51            ns       0.3                            g__RF39
## 52            ns       0.3            f__Butyricicoccaceae_NA
## 53            ns       0.3                     g__Collinsella
## 54 p<0.05 (solo)       0.3                          g__Dielma
## 55            ns       0.3                      g__uncultured
## 56 p<0.05 (solo)       0.3                  g__Cloacibacillus
## 57            ns       0.3                        g__Moryella
## 58            ns       0.3     g__[Eubacterium]_siraeum_group
## 59            ns       0.3                       g__Olsenella
## 60            ns       0.3                  g__Asteroleplasma
## 61            ns       0.3                     g__Haemophilus
## 62            ns       0.3                      g__Monoglobus
## 63            ns       0.3           g__Phascolarctobacterium
## 64            ns       0.3                    g__Ruminococcus
## 65            ns       0.3                g__Faecalibacterium
## 66            ns       0.3                 g__Bifidobacterium
## 67            ns       0.3                       g__Roseburia
## 68            ns       0.3                    g__Agathobacter
## 69            ns       0.3   g__Lachnospiraceae_NK4A136_group
## 70 p<0.05 (solo)       0.3  g__[Eubacterium]_ventriosum_group
## 71            ns       0.3                     g__Lachnospira
## 72       BH<0.05         1                g__Fusicatenibacter
##                                 label p_only  sig_q
## 1                       g__Parvimonas  FALSE q<0.01
## 2                    g__Fusobacterium   TRUE q<0.10
## 3               g__Peptostreptococcus   TRUE q<0.10
## 4      g__[Ruminococcus]_gnavus_group   TRUE     ns
## 5                    g__Solobacterium   TRUE q<0.10
## 6                      g__Barnesiella   TRUE     ns
## 7               o__Oscillospirales_NA   TRUE q<0.10
## 8                    g__Porphyromonas   TRUE q<0.10
## 9                       g__uncultured   TRUE q<0.10
## 10                g__Negativibacillus  FALSE     ns
## 11                g__Izemoplasmatales   TRUE     ns
## 12        g__Family_XIII_AD3011_group   TRUE     ns
## 13                g__Senegalimassilia  FALSE     ns
## 14                    g__Butyrivibrio  FALSE     ns
## 15                      g__Hungatella   TRUE q<0.10
## 16             g__Family_XIII_UCG-001   TRUE     ns
## 17                   g__Enterorhabdus  FALSE     ns
## 18                         g__Gemella  FALSE     ns
## 19     g__[Eubacterium]_nodatum_group   TRUE     ns
## 20                         g__UCG-010  FALSE     ns
## 21                         g__UBA1819  FALSE     ns
## 22                  g__Intestinimonas   TRUE     ns
## 23                   g__Butyricimonas  FALSE     ns
## 24                   g__Coprobacillus  FALSE     ns
## 25    g__[Ruminococcus]_torques_group   TRUE q<0.10
## 26                    g__Fournierella  FALSE     ns
## 27                         g__UCG-003  FALSE     ns
## 28                    g__Anaerostipes  FALSE     ns
## 29                         g__UCG-009   TRUE     ns
## 30               g__Lachnoclostridium  FALSE     ns
## 31         g__Lachnospiraceae_UCG-010  FALSE     ns
## 32        g__Hydrogenoanaerobacterium   TRUE     ns
## 33      g__[Eubacterium]_hallii_group  FALSE     ns
## 34                 g__Parabacteroides  FALSE     ns
## 35 g__[Eubacterium]_ruminantium_group  FALSE     ns
## 36                     g__Merdibacter   TRUE     ns
## 37                   g__Oscillibacter  FALSE     ns
## 38     g__Clostridium_sensu_stricto_1  FALSE     ns
## 39                      g__Howardella  FALSE     ns
## 40                       g__Bilophila  FALSE     ns
## 41                      g__uncultured  FALSE     ns
## 42                   g__Streptococcus  FALSE     ns
## 43 g__[Eubacterium]_fissicatena_group   TRUE     ns
## 44                  g__Butyricicoccus  FALSE     ns
## 45                      g__uncultured  FALSE     ns
## 46                      g__Romboutsia  FALSE     ns
## 47                     g__Eubacterium   TRUE     ns
## 48    g__[Clostridium]_innocuum_group  FALSE     ns
## 49                  g__Paraprevotella  FALSE     ns
## 50            g__Escherichia-Shigella  FALSE     ns
## 51                            g__RF39  FALSE     ns
## 52            f__Butyricicoccaceae_NA  FALSE     ns
## 53                     g__Collinsella  FALSE     ns
## 54                          g__Dielma   TRUE     ns
## 55                      g__uncultured  FALSE     ns
## 56                  g__Cloacibacillus   TRUE     ns
## 57                        g__Moryella  FALSE     ns
## 58     g__[Eubacterium]_siraeum_group  FALSE     ns
## 59                       g__Olsenella  FALSE     ns
## 60                  g__Asteroleplasma  FALSE     ns
## 61                     g__Haemophilus  FALSE     ns
## 62                      g__Monoglobus  FALSE     ns
## 63           g__Phascolarctobacterium  FALSE     ns
## 64                    g__Ruminococcus  FALSE     ns
## 65                g__Faecalibacterium  FALSE     ns
## 66                 g__Bifidobacterium  FALSE     ns
## 67                       g__Roseburia  FALSE     ns
## 68                    g__Agathobacter  FALSE     ns
## 69   g__Lachnospiraceae_NK4A136_group  FALSE     ns
## 70  g__[Eubacterium]_ventriosum_group   TRUE     ns
## 71                     g__Lachnospira  FALSE     ns
## 72                g__Fusicatenibacter  FALSE q<0.01

Because LinDA applies strict criteria, we show and retain only the features with a q-value below 0.1.
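The retention rule and the significance labels printed in the table above can be sketched in base R as follows. This is a minimal illustration, not the report's code: the feature names and p-values are hypothetical, and the labelling logic is only an assumption inferred from the `sig` column values (`BH<0.05`, `p<0.05 (solo)`, `ns`).

```r
# Hypothetical raw p-values for five features (illustrative values only)
res <- data.frame(
  feature = c("taxon_A", "taxon_B", "taxon_C", "taxon_D", "taxon_E"),
  pval    = c(0.0001, 0.012, 0.040, 0.200, 0.700)
)

# Benjamini-Hochberg adjustment across all tested features
res$qval <- p.adjust(res$pval, method = "BH")

# Three-level label: FDR-significant, nominally significant only, or neither
res$sig <- ifelse(res$qval < 0.05, "BH<0.05",
                  ifelse(res$pval < 0.05, "p<0.05 (solo)", "ns"))

# Retention rule described in the text: keep features with q < 0.1
retained <- res[res$qval < 0.1, ]

print(res)
print(retained$feature)
```

With these made-up values, three features pass the q < 0.1 rule, while only two would pass a q < 0.05 cut-off, which shows how much the chosen threshold matters in a small cohort.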

10.4.2.2 ASV

## Pseudo-count approach is used.

Because LinDA applies strict criteria, we show and retain only the features with a q-value below 0.1.

## Warning: No shared levels found between `names(values)` of the manual scale and the
## data's alpha values.

10.5 ZicoSeq

10.5.1 Unfiltered prevalence

10.5.1.1 Genus

## [1] "matrix" "array"
## For proportion and other data types,  posterior sampling will not be performed!
## The data has  36  samples and  144  features will be tested!
## On average,  1  outlier counts will be replaced for each feature!
## Permutation testing ...
## .........
## Completed!
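The "Permutation testing" step reported in the log can be illustrated with a generic label-permutation test on a single feature. This is a simplified sketch under stated assumptions, not ZicoSeq's actual procedure: the abundances are simulated, the test statistic (absolute difference in group means) is chosen only for illustration, and the group sizes (11 low, 25 severe) follow the cohort description.

```r
set.seed(42)  # reproducible permutations

# Hypothetical transformed abundances of one feature in 36 samples
group <- factor(rep(c("low", "severe"), times = c(11, 25)))
abund <- c(rnorm(11, mean = 2), rnorm(25, mean = 0))

# Observed statistic: absolute difference in group means
obs <- abs(diff(tapply(abund, group, mean)))

# Null distribution: recompute the statistic after shuffling the group labels
n_perm <- 999
perm <- replicate(n_perm, abs(diff(tapply(abund, sample(group), mean))))

# Permutation p-value with the +1 correction so it can never be exactly zero
p_perm <- (sum(perm >= obs) + 1) / (n_perm + 1)
p_perm
```

The smallest p-value this scheme can return is 1 / (n_perm + 1), so the number of permutations limits how small a p-value can be reported.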

10.5.1.2 ASV

## [1] "matrix" "array"
## For proportion and other data types,  posterior sampling will not be performed!
## The data has  36  samples and  224  features will be tested!
## On average,  1  outlier counts will be replaced for each feature!
## Permutation testing ...
## .........
## Completed!

10.5.2 Prevalence filtering

10.5.2.1 Genus

## [1] "matrix" "array"
## For proportion and other data types,  posterior sampling will not be performed!
## The data has  36  samples and  224  features will be tested!
## On average,  1  outlier counts will be replaced for each feature!
## Permutation testing ...
## .........
## Completed!

10.5.2.2 ASV

## [1] "matrix" "array"
## For proportion and other data types,  posterior sampling will not be performed!
## The data has  36  samples and  224  features will be tested!
## On average,  1  outlier counts will be replaced for each feature!
## Permutation testing ...
## .........
## Completed!
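The prevalence filtering contrasted with the unfiltered runs above can be sketched in base R. The count matrix and the 50% threshold are assumptions for illustration only; the threshold actually used in the report is not stated in this section.

```r
# Hypothetical count matrix: features in rows, samples in columns
counts <- matrix(
  c(5, 0, 3, 8,
    0, 0, 0, 2,
    1, 4, 0, 0,
    0, 0, 0, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("feature_", 1:4), paste0("sample_", 1:4))
)

# Prevalence = fraction of samples in which a feature is detected (count > 0)
prevalence <- rowMeans(counts > 0)

# Keep features detected in at least 50% of samples (assumed threshold)
min_prev <- 0.5
filtered <- counts[prevalence >= min_prev, , drop = FALSE]

rownames(filtered)
```

Filtering before testing reduces the number of hypotheses, which in turn makes the multiple-testing correction less severe for the features that remain.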

11 Findings

This study integrates clinical stratification with microbiome profiling to answer the questions posed in the Research Questions section.

[R] Key Results

These results suggest that two genera could serve as bacterial biomarkers: Parvimonas for the low-toxicity class and Fusicatenibacter for the severe-toxicity class.

Other bacterial genera worth keeping in future evaluations include:

  • Lachnospira and Lachnospiraceae NK4A136

  • Family XIII AD3011 group

  • Ruminococcus torques and Ruminococcus gnavus

  • Eubacterium ventriosum, which could be of interest in severe toxicity.

  • Members of the genus Eubacterium, which might be split between the two toxicity levels evaluated if the taxonomic resolution reaches species or strain level.

  • Members of the order Oscillospirales, for which the same higher resolution could also help.

Interestingly, Blautia species or strains could be relevant for separating these two toxicity classes. This is one of the reasons why future work will focus on Oxford Nanopore MinION sequencing.

[M] Methodological Note

11.0.1 Methods without features when applying BH correction

For ALDEx2 and LEfSe, no features were retained once the BH correction was applied, regardless of whether the data were unfiltered or filtered by prevalence. ALDEx2 was also one of the methods that allowed confounders to be added. Surprisingly, it still returned no results when the BH correction was applied.

11.0.2 Confounders analysis final thoughts

  • ANCOM-BC reported the highest number of new biomarkers to be kept in mind for future studies. Meanwhile, ALDEx2 still did not return any significant biomarkers.

  • LinDA returned some relevant bacterial biomarkers previously associated with colorectal cancer prognosis, such as Bifidobacterium or Streptococcus.

  • ZicoSeq and DESeq2 also gave results similar, in both number and bacteria detected, to those reported before confounders were included.

  • LEfSe could not be used on this evaluation.

Given the difficulties this research faced from the outset (a small dataset), including confounders is a useful strategy for obtaining meaningful outputs. The next step will therefore evaluate this part of the report in more detail.
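As a rough illustration of what adjusting for a confounder means in practice, the sketch below fits one linear model per feature on centred log-ratio (CLR) values, with the toxicity class and a covariate as predictors. Everything here is an assumption made for the example: the counts are simulated, `age` is a hypothetical covariate, and the plain `lm()` fit is a simplification that omits the bias correction and other refinements of tools such as LinDA or ANCOM-BC.

```r
set.seed(1)

# Hypothetical data: 36 samples, 5 features, a binary outcome and one covariate
n <- 36
counts <- matrix(rpois(n * 5, lambda = 50), nrow = n,
                 dimnames = list(NULL, paste0("taxon_", 1:5)))
meta <- data.frame(
  tox = factor(rep(c("low", "severe"), times = c(11, 25))),
  age = round(rnorm(n, mean = 65, sd = 8))
)

# Centred log-ratio transform with a pseudo-count of 1 to avoid log(0)
log_counts <- log(counts + 1)
clr <- log_counts - rowMeans(log_counts)

# Per-feature linear model: toxicity effect adjusted for the covariate
fit_one <- function(y) {
  fit <- lm(y ~ tox + age, data = meta)
  coef(summary(fit))["toxsevere", c("Estimate", "Pr(>|t|)")]
}
res <- as.data.frame(t(apply(clr, 2, fit_one)))
colnames(res) <- c("estimate", "pval")

# Multiple-testing correction across features
res$qval <- p.adjust(res$pval, method = "BH")
res
```

The `estimate` column is the adjusted difference (severe minus low) on the CLR scale, so its sign indicates the direction of the association after accounting for the covariate.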

12 Conclusions

[*] Conclusion

In conclusion, this study demonstrates the utility of integrating multiple DAA methodologies and normalisation strategies to identify microbial biomarkers associated with chemotherapy-related toxicity in colorectal cancer. Of note, compositionally aware methods such as ANCOM-BC, LinDA, and ZicoSeq proved most sensitive, detecting subtle microbial shifts in cohorts with minimal clinical divergence. DESeq2 was intermediate between conservative and sensitive, although only one feature was retained in the prevalence-filtered test with BH correction. The most surprising tool was LEfSe, which reported no features after BH correction in either the filtered or the unfiltered approach (even with two different normalisations: CPM and TSS). Finally, ALDEx2 was the most conservative, reporting no features when FDR control was applied.

Despite differences in statistical assumptions and data transformations, the consistent detection of Parvimonas and Fusicatenibacter across analytical frameworks and filtering approaches underscores their potential as key indicators of chemotherapy response. Several other genera could also be considered for further study. Notably, an oral bacterium that may already be a validated CRC-associated biomarker emerges as a promising new target for investigation.

However, future work should increase the sample size and explore other sequencing technologies, such as Oxford Nanopore (ONT). Applying complementary DAA methodologies or machine learning frameworks to validate these microbial signatures will also be essential to elucidate their roles.

[!] Limitation

  • Limitation: Small sample size and unbalanced dataset.

A key limitation is the small sample size, which leads to sparse cells for some locations and reduces power, precision and generalisability. Accordingly, estimates should be interpreted cautiously, with emphasis on effect sizes and uncertainty intervals rather than sole reliance on p-values; results warrant external validation and sensitivity analyses across DAA pipelines. We position these findings within prior literature [@cite1; @cite2] and outline implications for patient stratification and future study design.
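One way to report an effect size with an uncertainty interval, as recommended above, is a bootstrap percentile interval for the difference in group means. The sketch below is illustrative only: the CLR abundances are simulated, the group sizes follow the cohort (11 low, 25 severe), and the choice of a percentile bootstrap with 2000 resamples is an assumption rather than the report's method.

```r
set.seed(123)

# Hypothetical CLR abundances of one genus in the two toxicity groups
low    <- rnorm(11, mean = 1.0, sd = 1)
severe <- rnorm(25, mean = 0.2, sd = 1)

# Effect size: difference in group means (severe minus low)
effect <- mean(severe) - mean(low)

# Bootstrap: resample each group with replacement and recompute the effect
n_boot <- 2000
boot <- replicate(n_boot, {
  mean(sample(severe, replace = TRUE)) - mean(sample(low, replace = TRUE))
})

# 95% percentile interval for the effect
ci <- quantile(boot, probs = c(0.025, 0.975))
c(effect = effect, ci)
```

With only 11 samples in the smaller group, the interval is expected to be wide, which is exactly the imprecision a bare p-value would hide.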

[>>] Future Work

  • Future work: Replication in larger trials using Oxford Nanopore sequencing to resolve species- and strain-level differences. A more balanced dataset will be sought, and controls will be included as an additional class to check for any marked contrast between each toxicity class and controls. Additionally, bacteria previously associated with CRC, such as Parvimonas, Bacteroides or Fusicatenibacter, will be examined in more detail to see whether there is a correlation not observed in this study.

13 Paper Reference

This report is explained and discussed in the paper entitled “Evaluating DAA methodologies to detect microbiome taxa associated with Chemotherapy toxicity in a CRC cohort”, currently under submission.

14 NOTES

NOTE (1): The dataset evaluated in this research is a subset of the dataset analysed in previous papers, such as those cited in NOTE (2) below.

NOTE (2): The ancombc global function code was adapted from the papers:

Conde-Pérez et al. (2024). The multispecies microbial cluster of Fusobacterium, Parvimonas, Bacteroides and Faecalibacterium as a precision biomarker for colorectal cancer diagnosis. Molecular Oncology. DOI: https://doi.org/10.1002/1878-0261.13604.

Conde-Pérez, K., Buetas, E., Aja-Macaya, P., Martin-De Arribas, E., Iglesias-Corrás, I., Trigo-Tasende, N., Nasser-Ali, M., Estévez, L. S., Rumbo-Feal, S., Otero-Alén, B., Noguera, J. F., Concha, Á., Pardiñas-López, S., Carda-Diéguez, M., Gómez-Randulfe, I., Martínez-Lago, N., Ladra, S., Aparicio, L. A., Bou, G., . . . Poza, M. (2024). Parvimonas micra can translocate from the subgingival sulcus of the human oral cavity to colorectal adenocarcinoma. Molecular Oncology, 18(5), 1143-1173. DOI: https://doi.org/10.1002/1878-0261.13506.

15 Session Info

## R version 4.5.1 (2025-06-13)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 22.04.5 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so;  LAPACK version 3.10.0
## 
## locale:
##  [1] LC_CTYPE=es_ES.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=es_ES.UTF-8        LC_COLLATE=es_ES.UTF-8    
##  [5] LC_MONETARY=es_ES.UTF-8    LC_MESSAGES=es_ES.UTF-8   
##  [7] LC_PAPER=es_ES.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=es_ES.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: Europe/Madrid
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats4    grid      stats     graphics  grDevices utils     datasets 
## [8] methods   base     
## 
## other attached packages:
##  [1] ggVennDiagram_1.5.4         rsample_1.3.1              
##  [3] tidygraph_1.3.1             ape_5.8-1                  
##  [5] ggtree_3.16.0               scales_1.4.0               
##  [7] igraph_2.2.1                ggraph_2.2.1               
##  [9] viridis_0.6.5               viridisLite_0.4.2          
## [11] tibble_3.3.0                pheatmap_1.0.13            
## [13] GUniFrac_1.8                LinDA_0.2.0                
## [15] DESeq2_1.48.1               SummarizedExperiment_1.38.1
## [17] Biobase_2.68.0              MatrixGenerics_1.20.0      
## [19] matrixStats_1.5.0           GenomicRanges_1.60.0       
## [21] GenomeInfoDb_1.44.3         IRanges_2.42.0             
## [23] S4Vectors_0.46.0            BiocGenerics_0.54.1        
## [25] generics_0.1.4              ALDEx2_1.40.0              
## [27] latticeExtra_0.6-30         lattice_0.22-7             
## [29] zCompositions_1.5.0-5       survival_3.8-3             
## [31] truncnorm_1.0-9             MASS_7.3-65                
## [33] ANCOMBC_2.10.1              microbiome_1.30.0          
## [35] microbiomeMarker_1.13.2     vegan_2.7-2                
## [37] permute_0.9-8               tidytree_0.4.6             
## [39] phyloseq_1.52.0             glue_1.8.0                 
## [41] ggvenn_0.1.10               colorblindcheck_1.0.2      
## [43] randomcoloR_1.1.0.1         plotly_4.11.0              
## [45] camcorder_0.1.0             reprtree_0.6               
## [47] plotrix_3.8-4               tree_1.0-44                
## [49] randomForest_4.7-1.2        ggalluvial_0.12.5          
## [51] egg_0.4.5                   gridExtra_2.3              
## [53] RColorBrewer_1.1-3          VennDiagram_1.7.3          
## [55] futile.logger_1.4.3         readr_2.1.5                
## [57] patchwork_1.3.1             ggh4x_0.3.1                
## [59] gtools_3.9.5                stringr_1.5.2              
## [61] janitor_2.2.1               ggrepel_0.9.6              
## [63] ggplot2_4.0.0               dplyr_1.1.4                
## [65] plyr_1.8.9                  purrr_1.1.0                
## [67] tidyr_1.3.1                 qiime2R_0.99.6             
## [69] formatR_1.14                reshape2_1.4.4             
## [71] data.table_1.17.8           xtable_1.8-4               
## [73] devtools_2.4.5              usethis_3.1.0              
## [75] readxl_1.4.5                markdown_2.0               
## [77] rmarkdown_2.30              kableExtra_1.4.0           
## [79] knitr_1.50                 
## 
## loaded via a namespace (and not attached):
##   [1] coin_1.4-3              gld_2.6.7               urlchecker_1.0.1       
##   [4] nnet_7.3-20             DT_0.33                 Biostrings_2.76.0      
##   [7] TH.data_1.1-3           vctrs_0.6.5             energy_1.7-12          
##  [10] digest_0.6.37           png_0.1-8               shape_1.4.6.1          
##  [13] proxy_0.4-27            Exact_3.3               parallelly_1.45.0      
##  [16] deldir_2.0-4            magick_2.8.7            httpuv_1.6.16          
##  [19] foreach_1.5.2           withr_3.0.2             xfun_0.53              
##  [22] ggfun_0.1.9             ellipsis_0.3.2          doRNG_1.8.6.2          
##  [25] memoise_2.0.1           profvis_0.4.0           gmp_0.7-5              
##  [28] systemfonts_1.3.1       ragg_1.4.0              zoo_1.8-14             
##  [31] GlobalOptions_0.1.2     V8_6.0.4                Formula_1.2-5          
##  [34] promises_1.4.0          otel_0.2.0              httr_1.4.7             
##  [37] globals_0.18.0          rhdf5filters_1.20.0     rhdf5_2.52.1           
##  [40] rstudioapi_0.17.1       UCSC.utils_1.4.0        miniUI_0.1.2           
##  [43] base64enc_0.1-3         curl_7.0.0              polyclip_1.10-7        
##  [46] statip_0.2.3            quadprog_1.5-8          GenomeInfoDbData_1.2.14
##  [49] SparseArray_1.8.0       ade4_1.7-23             doParallel_1.0.17      
##  [52] evaluate_1.0.5          S4Arrays_1.8.1          gifski_1.32.0-2        
##  [55] Rfast_2.1.5.1           hms_1.1.3               glmnet_4.1-9           
##  [58] colorspace_2.1-2        magrittr_2.0.4          snakecase_0.11.1       
##  [61] modeltools_0.2-24       later_1.4.4             class_7.3-23           
##  [64] Hmisc_5.2-3             pillar_1.11.1           nlme_3.1-168           
##  [67] iterators_1.0.14        caTools_1.18.3          compiler_4.5.1         
##  [70] plotROC_2.3.1           stringi_1.8.7           biomformat_1.36.0      
##  [73] DescTools_0.99.60       stabledist_0.7-2        minqa_1.2.8            
##  [76] lubridate_1.9.4         crayon_1.5.3            abind_1.4-8            
##  [79] timeSeries_4041.111     gridGraphics_0.5-1      emdbook_1.3.13         
##  [82] locfit_1.5-9.12         haven_2.5.5             graphlayouts_1.2.2     
##  [85] bit_4.6.0               rootSolve_1.8.2.4       sandwich_3.1-1         
##  [88] libcoin_1.0-10          codetools_0.2-20        multcomp_1.4-28        
##  [91] textshaping_1.0.1       directlabels_2025.6.24  bslib_0.9.0            
##  [94] e1071_1.7-16            lmom_3.2                GetoptLong_1.0.5       
##  [97] multtest_2.64.0         mime_0.13               splines_4.5.1          
## [100] metagenomeSeq_1.50.0    circlize_0.4.16         Rcpp_1.1.0             
## [103] cellranger_1.1.0        interp_1.1-6            utf8_1.2.6             
## [106] clue_0.3-66             apeglm_1.30.0           fBasics_4041.97        
## [109] lme4_1.1-37             fs_1.6.6                listenv_0.9.1          
## [112] checkmate_2.3.2         Rdpack_2.6.4            pkgbuild_1.4.8         
## [115] expm_1.0-0              gsl_2.1-8               ggplotify_0.1.2        
## [118] Matrix_1.7-4            statmod_1.5.1           tzdb_0.5.0             
## [121] svglite_2.2.1           tweenr_2.0.3            pkgconfig_2.0.3        
## [124] tools_4.5.1             cachem_1.1.0            rbibutils_2.3          
## [127] numDeriv_2016.8-1.1     zigg_0.0.2              rmutil_1.1.10          
## [130] fastmap_1.2.0           sass_0.4.10             coda_0.19-4.1          
## [133] stable_1.1.6            rpart_4.1.24            farver_2.1.2           
## [136] reformulas_0.4.1        mgcv_1.9-3              yaml_2.3.10            
## [139] spatial_7.3-18          foreign_0.8-90          cli_3.6.5              
## [142] lifecycle_1.0.4         mvtnorm_1.3-3           lambda.r_1.2.4         
## [145] sessioninfo_1.2.3       backports_1.5.0         modeest_2.4.0          
## [148] BiocParallel_1.42.1     timechange_0.3.0        gtable_0.3.6           
## [151] rjson_0.2.23            parallel_4.5.1          limma_3.64.1           
## [154] CVXR_1.0-15             jsonlite_2.0.0          bitops_1.0-9           
## [157] bit64_4.6.0-1           Rtsne_0.17              yulab.utils_0.2.0      
## [160] RcppParallel_5.1.10     bdsmatrix_1.3-7         futile.options_1.0.1   
## [163] jquerylib_0.1.4         timeDate_4041.110       lazyeval_0.2.2         
## [166] shiny_1.11.1            htmltools_0.5.8.1       Wrench_1.26.0          
## [169] XVector_0.48.0          treeio_1.32.0           jpeg_0.1-11            
## [172] boot_1.3-32             R6_2.6.1                gplots_3.2.0           
## [175] labeling_0.4.3          Rmpfr_1.1-0             forcats_1.0.0          
## [178] bbmle_1.0.25.1          cluster_2.1.8.1         rngtools_1.5.2         
## [181] pkgload_1.4.0           Rhdf5lib_1.30.0         aplot_0.2.8            
## [184] nloptr_2.2.1            DelayedArray_0.34.1     tidyselect_1.2.1       
## [187] htmlTable_2.4.3         ggforce_0.5.0           inline_0.3.21          
## [190] xml2_1.3.8              future_1.58.0           KernSmooth_2.23-26     
## [193] S7_0.2.0                furrr_0.3.1             rsvg_2.6.2             
## [196] htmlwidgets_1.6.4       ComplexHeatmap_2.24.1   rlang_1.1.6            
## [199] lmerTest_3.1-3          remotes_2.5.0